ABOUT ME

Bryan Kelly is Professor of Finance at the Yale School of Management, a Research Fellow at the National Bureau of Economic Research, Associate Director of SOM’s International Center for Finance, and head of machine learning at AQR Capital Management. Professor Kelly’s primary research fields are asset pricing, machine learning, and financial econometrics. He is interested in issues related to expected return, volatility, tail risk, and correlation modeling in financial markets; financial sector systemic risk; financial intermediation; and financial networks. He has served as co-editor of the Journal of Financial Econometrics and associate editor of the Journal of Finance and the Journal of Financial Economics. Before joining Yale, Kelly was a tenured professor of finance at the University of Chicago Booth School of Business. He earned an AB in economics from the University of Chicago, an MA in economics from the University of California San Diego, and a PhD and MPhil in finance from New York University’s Stern School of Business. Kelly worked in investment banking at Morgan Stanley prior to his PhD.

DATA

Go to jkpfactors.com to download factor portfolio return data for the 153 factors in 93 countries studied in "Is There A Replication Crisis In Finance?" by Jensen, Kelly, and Pedersen (2021) in The Journal of Finance.

Our GitHub Code Repository provides code to produce all underlying stock-level signals. For researchers with a WRDS account, this SAS code runs on the WRDS server to produce 406 characteristics (including the 153 in our paper) and the associated factor portfolios in 93 countries.


Our Documentation (.pdf) describes data contents in detail and provides step-by-step explanation of each variable's construction.

To request additional data, ask any questions about our data, or report any issues, please email me.

This website analyzes results and provides data based on "Business News and Business Cycles" by Bybee, Kelly, Manela, and Xiu (2020).

INTERMEDIARY ASSET PRICING

Intermediary capital risk factor, 1970Q1–2018Q3, based on "Intermediary Asset Pricing: New Evidence From Many Asset Classes" by He, Kelly, and Manela (2017) in The Journal of Financial Economics. The factor is available at quarterly and monthly frequencies and, starting 2000-01-01, at daily frequency. Also included are the portfolio returns used in our cross-sectional tests. See readme.txt inside for details and replication code. Courtesy of Asaf Manela. Some of these series are updated more frequently by Zhiguo He and are available here.

Data and documentation for corporate bond risk factors estimated via IPCA as in "Modeling Corporate Bond Returns" by Kelly, Pruitt, and Palhares (2022) in The Journal of Finance.

 

PUBLISHED ARTICLES

Journal of Finance, Forthcoming (with J. Jiang and D. Xiu)

Annual Review of Financial Economics, In Process (with S. Giglio and D. Xiu)

Journal of Finance, Forthcoming (with D. Palhares and S. Pruitt)

Journal of Financial Economics, Forthcoming (with M. Buechner)

Journal of Finance, Forthcoming (with T. Jensen and L. Pedersen)

Journal of Finance, Forthcoming (with S. Malamud and L. Pedersen)

Journal of Business and Economic Statistics, Invited Paper (with A. Manela and A. Moreira)

Annual Review of Financial Economics, Forthcoming (with S. Giglio and J. Stroebel)

Journal of Financial Economics, Forthcoming (with I. Dew-Becker and S. Giglio)

Journal of Financial Economics, 2021 (with S. Pruitt and T. Moskowitz)

American Economic Review: Insights, Forthcoming (with D. Papanikolaou, A. Seru and M. Taddy)

Journal of Political Economy, 2021 (with B. Herskovic, H. Lustig and S. Van Nieuwerburgh)

Journal of Investment Management, 2020 (with R. Israel and T. Moskowitz)

Journal of Financial Economics, 2020 (with Y. Chen and W. Wu)

Journal of Portfolio Management, 2019 (with T. Gupta)

Journal of Econometrics, 2021 (with S. Gu and D. Xiu)

Review of Financial Studies, 2020 (with S. Gu and D. Xiu)

Review of Financial Studies, 2020 (with R. Engle, S. Giglio, H. Lee and J. Stroebel)

Journal of Financial Economics, 2019 (with S. Pruitt and Y. Su)

Journal of Economic Literature, 2019 (with M. Gentzkow and M. Taddy)

Quarterly Journal of Economics, 2018 (with S. Giglio)

Journal of Financial Economics, 2017 (with Z. He and A. Manela)

American Economic Review, 2016 (with H. Lustig and S. Van Nieuwerburgh)

Journal of Finance, 2016 (with L. Pastor and P. Veronesi)

Journal of Financial Economics, 2016 (with B. Herskovic, H. Lustig and S. Van Nieuwerburgh)

Journal of Financial Economics, 2016 (with S. Giglio and S. Pruitt)

Review of Financial Studies, 2014 (with H. Jiang)

Journal of Finance, 2014 (with K. Balakrishnan, M. Billings and A. Ljungqvist)

Journal of Finance, 2013 (with S. Pruitt)

Review of Financial Studies, 2012 (with A. Ljungqvist)

Journal of Business and Economic Statistics, 2012 (with R. Engle)

Journal of Risk, 2011 (with C. Brownlees and R. Engle)

WORKING PAPERS

 

(with T. Jensen, S. Malamud, and L. Pedersen)

We develop a framework that integrates trading-cost-aware portfolio optimization with ML. While numerous studies use ML return forecasts to generate portfolios, their agnosticism toward trading costs leads to excessive reliance on fleeting small-scale characteristics, resulting in poor net returns. We propose that investment strategies should be evaluated based on their “implementable efficient frontier,” and show that our method produces a superior frontier. The superior net-of-cost performance is achieved by integrating ML into the portfolio problem, learning directly about portfolio weights (rather than returns). Lastly, our model gives rise to a new measure of “economic feature importance.”

(with S. Malamud and K. Zhou)

We theoretically characterize the behavior of machine learning portfolios in the high complexity regime, i.e. when the number of parameters exceeds the number of observations. We demonstrate a surprising “virtue of complexity”: Sharpe ratios of machine learning portfolios generally increase with model parameterization, even with minimal regularization. Empirically, we document the virtue of complexity in US equity market timing strategies. High complexity models deliver economically large and statistically significant out-of-sample portfolio gains relative to simpler models, due in large part to their remarkable ability to predict recessions.
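As a rough illustration of the high complexity regime described above, the following Python sketch fits a ridge regression on random features when the number of parameters far exceeds the number of observations. It is a minimal sketch under stated assumptions, not the paper's code: the function name, the Gaussian random weights, and the sine/cosine feature map are illustrative choices, and the data in the usage lines are simulated.

```python
import numpy as np

def random_feature_ridge_forecast(X_train, y_train, X_test, n_features, ridge, seed=0):
    """Ridge regression on random sin/cos features (hypothetical helper).

    Designed for the case n_features >> number of training observations.
    """
    rng = np.random.default_rng(seed)
    # Random, untrained input weights (assumption: standard normal draws).
    W = rng.standard_normal((X_train.shape[1], n_features // 2))

    def features(X):
        Z = X @ W
        return np.hstack([np.sin(Z), np.cos(Z)])

    S_train, S_test = features(X_train), features(X_test)
    T = S_train.shape[0]
    # Dual form of ridge: beta = S'(SS' + ridge * I)^{-1} y.
    # This only requires solving a T x T system, which is cheap when P >> T.
    alpha = np.linalg.solve(S_train @ S_train.T + ridge * np.eye(T), y_train)
    beta = S_train.T @ alpha
    return S_test @ beta

# Usage on simulated data: 30 observations, 2,000 parameters.
rng = np.random.default_rng(1)
X, y = rng.standard_normal((30, 5)), rng.standard_normal(30)
X_new = rng.standard_normal((10, 5))
forecast = random_feature_ridge_forecast(X, y, X_new, n_features=2000, ridge=1e-3)
```

The dual form is used only for convenience; for a positive ridge penalty it gives the same coefficients as the usual primal ridge formula.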

(with S. Malamud and K. Zhou)

We document the "virtue of complexity" in all asset classes that we study (US equities, international equities, bonds, commodities, currencies, and interest rates). Return prediction R-squared and optimal portfolio Sharpe ratio generally increase with model parameterization for every asset class. The virtue of complexity is present even in extremely data-scarce environments, e.g., for predictive models with fewer than twenty observations and tens of thousands of predictors. The empirical association between model complexity and out-of-sample model performance exhibits a striking consistency with theoretical predictions.

(with A. Didisheim and S. Malamud)

We introduce a methodology for designing and training deep neural networks (DNN) that we call "Deep Regression Ensembles" (DRE). It bridges the gap between DNN and two-layer neural networks trained with random feature regression. Each layer of DRE has two components: randomly drawn input weights and output weights trained myopically (as if the final output layer) using linear ridge regression. Within a layer, each neuron uses a different subset of inputs and a different ridge penalty, constituting an ensemble of random feature ridge regressions. Our experiments show that a single DRE architecture is on par with or exceeds state-of-the-art DNN in many data sets. Yet, because DRE neural weights are either known in closed form or randomly drawn, its computational cost is orders of magnitude smaller than DNN.
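The layer structure described in this abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the tanh activation, the Gaussian input weights, the log-spaced penalty grid, and averaging the last layer's neurons into one prediction are all illustrative choices that the abstract does not specify.

```python
import numpy as np

def fit_layer(X, y, n_neurons, n_random, subset_size, rng):
    """Fit one DRE-style layer (hypothetical helper).

    Each neuron is a random feature ridge regression on its own input
    subset, with its own ridge penalty, trained to predict y directly.
    """
    penalties = np.logspace(-2, 2, n_neurons)  # assumption: one penalty per neuron
    neurons = []
    for lam in penalties:
        cols = rng.choice(X.shape[1], size=min(subset_size, X.shape[1]), replace=False)
        W = rng.standard_normal((cols.size, n_random))  # random input weights, never trained
        H = np.tanh(X[:, cols] @ W)                     # assumption: tanh activation
        # Closed-form ridge solution for the output weights.
        beta = np.linalg.solve(H.T @ H + lam * np.eye(n_random), H.T @ y)
        neurons.append((cols, W, beta))
    return neurons

def apply_layer(X, neurons):
    """Return one column of fitted values per neuron."""
    return np.column_stack([np.tanh(X[:, cols] @ W) @ beta for cols, W, beta in neurons])

def fit_dre(X, y, n_layers=2, n_neurons=8, n_random=20, subset_size=3, seed=0):
    """Stack layers: each layer's neuron outputs are the next layer's inputs."""
    rng = np.random.default_rng(seed)
    layers, inputs = [], X
    for _ in range(n_layers):
        neurons = fit_layer(inputs, y, n_neurons, n_random, subset_size, rng)
        layers.append(neurons)
        inputs = apply_layer(inputs, neurons)
    return layers

def predict_dre(X, layers):
    """Pass X through all layers and average the last layer's neurons (assumption)."""
    out = X
    for neurons in layers:
        out = apply_layer(out, neurons)
    return out.mean(axis=1)
```

Because every output weight comes from a closed-form ridge solve and every input weight is a random draw, this sketch involves no gradient-based training, which is the source of the computational savings the abstract mentions.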

(with L. Bybee and Y. Su)

We seek fundamental risks from news text. Conceptually, news is closely related to the idea of systematic risk, in particular the "state variables" in the ICAPM. News captures investors' concerns about future investment opportunities, and hence drives the current pricing kernel. This paper demonstrates a way to extract a parsimonious set of risk factors and eventually a univariate pricing kernel from news text. The state variables are reduced and selected from the variations in attention allocated to different news narratives. As a result, the risk factors attain clear text-based interpretability as well as top-of-the-line asset pricing performance. The empirical method integrates topic modeling (LDA), latent factor analysis (IPCA), and variable selection (group lasso).

(with L. Bybee, A. Manela, and D. Xiu)

(Formerly "The Structure of Economic News.") We propose an approach to measuring the state of the economy via textual analysis of business news. From the full text content of 800,000 Wall Street Journal articles for 1984–2017, we estimate a topic model that summarizes business news as easily interpretable topical themes and quantifies the proportion of news attention allocated to each theme at each point in time. We then use our news attention estimates as inputs into statistical models of numerical economic time series. We demonstrate that these text-based inputs accurately track a wide range of economic activity measures and that they have incremental forecasting power for macroeconomic outcomes, above and beyond standard numerical predictors. Finally, we use our model to retrieve the news-based narratives that underlie “shocks” in numerical economic data.

We introduce a new text-mining methodology that extracts sentiment information from news articles to predict asset returns. Unlike more common sentiment scores used for stock return prediction (e.g., those sold by commercial vendors or built with dictionary-based methods), our supervised learning framework constructs a sentiment score that is specifically adapted to the problem of return prediction. Our method proceeds in three steps: 1) isolating a list of sentiment terms via predictive screening, 2) assigning sentiment weights to these words via topic modeling, and 3) aggregating terms into an article-level sentiment score via penalized likelihood. We derive theoretical guarantees on the accuracy of estimates from our model with minimal assumptions. In our empirical analysis, we text-mine one of the most actively monitored streams of news articles in the financial system—the Dow Jones Newswires—and show that our supervised sentiment model excels at extracting return-predictive signals in this context.

We use a large cross-section of equity returns to estimate a rich affine model of equity prices, dividends, returns and their dynamics. Using the model, we price dividend strips of the aggregate market index, as well as any other well-diversified equity portfolio. We do not use any dividend strips data in the estimation of the model; however, model-implied equity yields closely match the equity yields from the traded dividend forwards reported in the literature. Our model can therefore be used to extend the data on the term structure of discount rates in three dimensions: (i) over time, back to the 1970s; (ii) across maturities, since we are not limited by the maturities of actually traded dividend claims; and most importantly, (iii) across portfolios, since we generate a term structure for any portfolio of stocks (e.g., small or value stocks). The new term structure data generated by our model (e.g., separate term structures for value, growth, investment and other portfolios, observed over a span of 45 years that covers several recessions) represent new empirical moments that can be used to guide and evaluate asset pricing models.

(with S. Pruitt and Y. Su)

Econometric development of the IPCA method used in "Characteristics Are Covariances: A Unified Model of Risk and Return."

(with R. Israelov)

Uncertainty about the future option return has two sources: changes in the position and shape of the implied volatility surface that shift option values (holding moneyness and maturity fixed), and changes in the underlying price which alter an option's location on the surface and thus its value (holding the surface fixed). We estimate a joint time series model of the spot price and volatility surface and use this to construct an ex ante characterization of the option return distribution via bootstrap. Our "ORB" (option return bootstrap) model accurately forecasts means, variances, and extreme quantiles of S&P 500 index conditional option return distributions across a wide range of strikes and maturities.

CONTACT

Bryan Kelly

Yale School of Management

165 Whitney Ave. 

New Haven, CT 06511

bryan.kelly@yale.edu

203-432-2221