Multivariate design is useful for selecting informative training and validation sets. The essence of this approach is (i) to describe the compounds with many descriptors, (ii) to summarize these descriptors by means of principal component analysis (PCA), and (iii) to create an informative multivariate design in the resulting PC scores ("principal properties", "PPs"). This approach has been used to select representative compounds in many areas, e.g., organic chemistry, crystallization modelling, environmental chemistry and QSAR, combinatorial chemistry, and biopolymer sequence modelling.
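The three steps can be illustrated with a minimal sketch. It is not taken from the text: the descriptor matrix is random placeholder data, the function names `pca_scores` and `select_by_factorial_design` are hypothetical, and the selection rule (taking the compound closest to each corner of a two-level full factorial design, plus a center point, in the score space) is only one assumed way of laying a design over the principal properties.

```python
import itertools
import numpy as np


def pca_scores(X, n_components):
    """Autoscale X (compounds x descriptors) and return its first PC scores."""
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant descriptors unscaled
    Z = Xc / std
    # SVD of the autoscaled matrix: scores are U * s, loadings are rows of Vt
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * s[:n_components]


def select_by_factorial_design(scores, include_center=True):
    """Pick the compound nearest to each design point in score space.

    Assumed rule: design points are the corners of a two-level full
    factorial design, scaled to the largest absolute score on each PC,
    optionally extended with a center point.
    """
    k = scores.shape[1]
    scale = np.abs(scores).max(axis=0)
    levels = list(itertools.product([-1.0, 1.0], repeat=k))
    if include_center:
        levels.append((0.0,) * k)
    targets = np.array(levels) * scale

    chosen = []
    for target in targets:
        dist = np.linalg.norm(scores - target, axis=1)
        if chosen:
            dist[chosen] = np.inf  # never select the same compound twice
        chosen.append(int(np.argmin(dist)))
    return chosen


# (i) Describe the compounds: 50 hypothetical compounds, 12 descriptors each
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 12))

# (ii) Summarize the descriptors as two principal properties
scores = pca_scores(X, n_components=2)

# (iii) Lay a design over the score space and pick the training compounds
training = select_by_factorial_design(scores)
print("Selected compound indices:", training)
```

With two principal properties the design has four corners and one center point, so five compounds are selected. Compounds not selected remain available as candidates for a validation set, which could be chosen with a second design in the same score space.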