Advances in technology are yielding increasingly large datasets. As one example of such "high-dimensional" data, we consider predicting a clinical outcome (e.g., survival time after removal of a tumor) from measurements of gene expression. We have two datasets, each of which measures gene expression using a different platform, and one set of measurements is more precise than the other. Our primary goal is to combine these datasets so as to make the best possible predictions for future observations. The talk draws primarily on ideas from linear regression, a fundamental but relatively simple model-fitting technique, and I will include several introductory slides for those unfamiliar with it. For students contemplating graduate school, I will also shamelessly promote the field of biostatistics.