Large diffusion models have made a remarkable leap in synthesizing high-quality artistic images from text descriptions. However, these powerful pre-trained models still lack control over key material appearance properties such as gloss. In this work, we present a threefold contribution: (1) we analyze how gloss is perceived across different artistic styles, namely oil painting, watercolor, ink pen, charcoal, and soft crayon; (2) we leverage our findings to create a dataset of 1,336,272 stylized images covering many different geometries in all five styles, each paired with an automatically computed text description of its appearance (e.g., “A glossy bunny hand painted with an orange soft crayon”); and (3) we train a ControlNet to condition Stable Diffusion XL for synthesizing painterly depictions of novel objects from simple inputs such as edge maps, hand-drawn sketches, or clip art. Compared to previous approaches, our framework yields more accurate results despite its simpler inputs, as we show both quantitatively and qualitatively.
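To make contribution (3) concrete, the sketch below shows how a ControlNet-conditioned Stable Diffusion XL pipeline of this kind could be driven with the Hugging Face `diffusers` library, using a Canny edge map as the conditioning input and a prompt following the dataset's description template. This is a minimal illustration under stated assumptions, not the paper's released code: the checkpoint path `path/to/gloss-controlnet` and the input file `bunny.png` are placeholders.

```python
# Minimal sketch: conditioning Stable Diffusion XL with a trained ControlNet
# on an edge map, via Hugging Face `diffusers`. The checkpoint path
# "path/to/gloss-controlnet" is a placeholder for weights like those
# described in contribution (3); "bunny.png" is a hypothetical input.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline

# Load the edge-map-conditioned ControlNet (placeholder path) and attach it
# to the SDXL base model.
controlnet = ControlNetModel.from_pretrained(
    "path/to/gloss-controlnet", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Build a 3-channel edge-map condition from an input image (Canny edges).
image = np.array(Image.open("bunny.png").convert("RGB"))
edges = cv2.Canny(image, 100, 200)
edge_map = Image.fromarray(np.stack([edges] * 3, axis=-1))

# The prompt follows the dataset's appearance-description template.
prompt = "A glossy bunny hand painted with an orange soft crayon"
result = pipe(prompt, image=edge_map, num_inference_steps=30).images[0]
result.save("glossy_bunny_crayon.png")
```

The same call pattern would apply to the other simple inputs mentioned above (hand-drawn sketches or clip art), swapping in the corresponding conditioning image.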