Learning from Cross-Modality Data for Image Semantics Understanding, Description, Synthesis, & Manipulation