Markov switching models with time-varying means, variances and mixing weights are applied to characterize business cycle variation in the probability distribution and higher-order moments of stock returns. This allows us to provide a comprehensive characterization of risk that goes well beyond the mean and variance of returns. We compare several mixture models with different specifications of the state transition and propose a new mixture of Gaussian and Student-t distributions that captures outliers in returns. The models produce very similar expected returns and volatilities but imply very different time series for conditional skewness, kurtosis and predictive densities. Consistent with economic theory, the gains in predictive accuracy from considering two-state mixture models rather than a single-state specification are higher for small firms than for large firms.
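To make concrete how a two-state mixture generates skewness and excess kurtosis even when each state is Gaussian, the following minimal sketch computes the first four moments of a two-component Gaussian mixture for a given mixing weight. It is an illustration under simplifying assumptions, not the specification estimated in the paper: the function name `mixture_moments`, its arguments and all parameter values are hypothetical, both components are taken to be Gaussian (the Gaussian and Student-t mixture is not reproduced), and the weight `p` is a fixed number rather than a time-varying state probability.

```python
def mixture_moments(p, mu1, sigma1, mu2, sigma2):
    """Mean, variance, skewness and kurtosis of a two-component Gaussian mixture.

    p is the weight on component 1; (mu_i, sigma_i) are the component
    means and standard deviations. All names are illustrative assumptions.
    """
    weights = (p, 1.0 - p)
    mus = (mu1, mu2)
    sigmas = (sigma1, sigma2)

    # Mixture mean is the weighted average of the component means.
    mean = sum(w * m for w, m in zip(weights, mus))

    # Central moments of the mixture, built from Gaussian component moments
    # around the mixture mean (d is the component mean's offset from it).
    m2 = m3 = m4 = 0.0
    for w, m, s in zip(weights, mus, sigmas):
        d = m - mean
        m2 += w * (s**2 + d**2)
        m3 += w * (3 * d * s**2 + d**3)
        m4 += w * (3 * s**4 + 6 * d**2 * s**2 + d**4)

    skewness = m3 / m2**1.5
    kurtosis = m4 / m2**2
    return mean, m2, skewness, kurtosis


if __name__ == "__main__":
    # Hypothetical values: a calm state with weight p and a volatile,
    # low-mean state with weight 1 - p.
    for p in (0.95, 0.80, 0.50):
        mean, var, skew, kurt = mixture_moments(p, 0.01, 0.04, -0.02, 0.08)
        print(f"p={p:.2f} mean={mean:.4f} vol={var**0.5:.4f} "
              f"skew={skew:.3f} kurt={kurt:.3f}")
```

In this sketch, skewness arises only when the component means differ, and kurtosis above 3 arises whenever the component means or variances differ, so changes in the mixing weight alone can move the higher-order moments.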