Analysis of Medium Class
The original YOR "Medium" field had four possible values (Hydro, Soil, Fortified and Tron). Because we need quantitative variables in order to calculate a correlation with the quantitative variable of crop yield, we created four dummy variables from this single field, one for each category. Each dummy variable is coded either 1 or 0: if Hydro medium was used, the Hydro variable is assigned a value of 1; if Hydro medium was not used, it is assigned a value of 0. Similar assignments are made for the other three dummy variables, so the four are mutually exclusive: if one of them has a value of 1 for a given grow report, the other three must have values of 0.
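The coding scheme described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the analysis: the helper name `dummy_code` and the five sample reports are hypothetical, and only the four category names come from the text.

```python
# Minimal sketch of dummy coding for the "Medium" field.
# The category names come from the text; everything else is hypothetical.
MEDIA = ("Hydro", "Soil", "Fortified", "Tron")


def dummy_code(medium):
    """Return one 0/1 indicator per medium category for a single grow report."""
    if medium not in MEDIA:
        raise ValueError(f"unknown medium: {medium!r}")
    # Exactly one indicator is 1; the other three are 0.
    return {name: int(name == medium) for name in MEDIA}


# Made-up example reports (not real YOR data).
reports = ["Hydro", "Soil", "Soil", "Tron", "Fortified"]
coded = [dummy_code(m) for m in reports]

for medium, row in zip(reports, coded):
    print(medium, row)
```

Each row contains a single 1, which is the mutual exclusivity noted above.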
Here the simple correlations suggest that a hydroponic medium is most favorably associated with crop yield (a modest positive correlation) and soil least favorably (a modest negative correlation), with Fortified and Tron falling somewhere in between these two extremes. Each correlation was computed between crop yield and the corresponding dummy variable.
Medium       Correlation   Significance
Fortified    .078          .335
Hydro        .366**        .000
Soil         -.339**       .000
Tron         .118          .146
** Correlation is significant at the 0.01 level.
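For readers who want to see how a simple correlation between a 0/1 dummy variable and crop yield is obtained, here is a minimal sketch using only the Python standard library. It assumes the reported values are ordinary Pearson correlations, which the text does not state explicitly. The yield figures are invented for illustration and will not reproduce the values reported here. The function names `pearson_r` and `t_statistic` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Invented data: 1 = hydro medium used, 0 = any other medium.
hydro = [1, 1, 1, 0, 0, 0, 0, 0]
yields = [520.0, 480.0, 450.0, 300.0, 410.0, 350.0, 390.0, 330.0]

r = pearson_r(hydro, yields)
t = t_statistic(r, len(hydro))
print(f"r = {r:.3f}, t = {t:.3f} on {len(hydro) - 2} degrees of freedom")
```

The significance level is then read from a t distribution with n - 2 degrees of freedom. That lookup is left out because the standard library has no t distribution. Flipping the coding (1 for non-hydro, 0 for hydro) changes only the sign of r.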
However, simple correlations can be misleading, because they do not take into account, or "control for," differences other than medium among the grow reports in our analytic sample. For example, it might be that hydro growers also tend to use HPS lighting, while soil growers tend to use fluorescent lighting. So we must be careful not to jump to conclusions just yet. At this stage we are simply trying to eliminate unimportant influences from our subsequent predictive modeling efforts. Based on these simple correlations, we will carry the Hydro and Soil dummy variables into a later round of modeling, because they show statistically significant simple correlations with crop yield. We will drop the Fortified and Tron dummy variables from future analyses, because their correlations are not statistically significant.
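The screening step just described amounts to a simple filter on the reported significance values. The sketch below uses the four values from the table. The 0.01 cutoff is an assumption taken from the table footnote, since the text does not name the threshold used for dropping variables. The variable names are hypothetical.

```python
# Assumed cutoff: the 0.01 level named in the table footnote.
ALPHA = 0.01

# Significance values as reported in the table (.000 is a rounded value).
significance = {
    "Fortified": 0.335,
    "Hydro": 0.000,
    "Soil": 0.000,
    "Tron": 0.146,
}

# Keep a dummy variable only if its simple correlation is significant.
kept = [name for name, p in significance.items() if p < ALPHA]
dropped = [name for name in significance if name not in kept]

print("carry forward:", kept)
print("drop:", dropped)
```

With these values the filter keeps Hydro and Soil and drops Fortified and Tron, matching the decision stated above. The same split results at the more lenient 0.05 level.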