
Marcos M. López de Prado

Author of Machine Learning for Asset Managers

2 Works, 14 Members, 1 Review

Works by Marcos M. López de Prado


Reviews

I got a lot of value out of this book. I had never thought about how we can adjust the closing price to account for intraday volatility, or about how most Sharpe ratios are inflated because they don't correct for multiple backtests, or about how we can use the Sharpe ratio to size our bets. I'm glad I bought this book and read it through. Also, it's great that the author provides code snippets.
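To see why uncorrected Sharpe ratios mislead, here is a quick Monte Carlo sketch of my own (not from the book; every parameter here is made up): simulate many strategies with zero true edge and look only at the best backtest.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical setup: 1000 random "strategies", one year of daily returns.
n_strategies, n_days = 1000, 252

# Every strategy is pure noise: i.i.d. normal daily returns, zero mean.
returns = rng.normal(loc=0.0, scale=0.01, size=(n_strategies, n_days))

# Annualized Sharpe ratio of each backtest.
sharpes = returns.mean(axis=1) / returns.std(axis=1) * np.sqrt(252)

# Selecting the best of many backtests makes noise look like skill.
best = sharpes.max()
print(f"best of {n_strategies} random backtests: Sharpe = {best:.2f}")
```

The average Sharpe across all the random strategies hovers near zero, but the maximum is large, which is exactly the inflation the book's deflated-Sharpe machinery tries to correct.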

That said, this book needs a major revision. The writing is unacceptably poor. At times it's hard to know who is doing what. Look at this paragraph:

The goal of meta-labeling is to train a secondary model on the prediction outcomes of a primary model, where losses are labeled as “0” and gains are labeled as “1.” Therefore, the secondary model does not predict the side. Instead, the secondary model predicts whether the primary model will succeed or fail at a particular prediction (a meta-prediction). The probability associated with a “1” prediction can then be used to size the position, as explained next.


When he says "...where losses are labeled as '0' and gains are labeled as '1,'" is he referring to the primary model or to the secondary model? And the "probability associated with a '1' prediction" is something I can get from the primary model itself, so why do I need a secondary model here? The paragraph ends with "as explained next," but what comes next is how to use these probabilities (from which model?) to size bets, not how the secondary model fits into the picture.
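For what it's worth, here is how I eventually made sense of it: the "0"/"1" labels describe the primary model's outcomes. A minimal sketch of that reading (my own toy reconstruction, not the author's code; the data and model choices are arbitrary):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: X features, y in {-1, +1} is the true side of the move.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = np.sign(X[:, 0] + 0.5 * rng.normal(size=500))
y[y == 0] = 1

# Primary model: predicts the SIDE (long/short) of the trade.
primary = LogisticRegression().fit(X, y)
side = primary.predict(X)

# Meta-labels: 1 where the primary model's call was right (a "gain"),
# 0 where it was wrong (a "loss") -- the labels refer to the primary
# model's outcomes, not to the secondary model.
meta_y = (side == y).astype(int)

# Secondary model: predicts whether the primary model will be right here.
secondary = LogisticRegression().fit(X, meta_y)
p_correct = secondary.predict_proba(X)[:, 1]

# Bet size: trade in the primary model's direction, scaled by the
# secondary model's confidence that the primary call is correct.
bet = side * p_correct
```

Under this reading the secondary model adds something the primary's own probabilities don't: it is trained on a different target (hit/miss), so it can learn when the primary model is reliable.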

Another issue is that the author doesn't bother to justify many of his choices. Take his discussion of mean-decrease accuracy, for instance: he suggests shuffling a feature (thus breaking the sample-feature correspondence) and measuring how much that degrades the model's performance. That sounds reasonable, but what's the advantage of doing that versus simply dropping the feature altogether?
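Trying to answer my own question: as far as I can tell, the usual argument is cost. Shuffling keeps the fitted model fixed and needs no retraining, while drop-column importance requires one full refit per feature. A toy comparison (my own sketch, not from the book):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical data: feature 0 is informative, feature 2 is pure noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(600, 3))
y = (X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.5, size=600) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
base = model.score(X_te, y_te)

# Shuffle (mean-decrease accuracy) importance: ONE fit, reused for every feature.
perm_imp = []
for j in range(X.shape[1]):
    Xs = X_te.copy()
    Xs[:, j] = rng.permutation(Xs[:, j])  # break the sample-feature link
    perm_imp.append(base - model.score(Xs, y_te))

# Drop-column importance: a FULL refit per feature -- the practical
# reason shuffling is often preferred, though the book never says so.
drop_imp = []
for j in range(X.shape[1]):
    cols = [c for c in range(X.shape[1]) if c != j]
    m = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr[:, cols], y_tr)
    drop_imp.append(base - m.score(X_te[:, cols], y_te))
```

Both methods flag the informative feature; the shuffle version just gets there without n_features extra training runs. It would have taken the author one sentence to say this.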

Relatedly, it would be great to see some real-world examples in the book. The code snippets are super helpful (and often necessary to understand the point, given the poor writing), but they all use synthetic data. I'm sold on the "Monte Carlos beat p-values" approach, but real-world examples would give us a sense of how much better the author's solutions are compared to the ones he criticizes.

Finally, the book is missing a chapter on forecasting models. It's great to learn about labeling, feature importance, etc., but the point of all these concepts is to help us produce better forecasts. Where is the chapter that discusses and compares ARIMAX, LSTM, Markov models, and so on? It feels like the most important topic has been left out of the book.

Oh, just one more thing: buy the paper version, not the Kindle version. The author/publisher didn't bother to save the file with UTF-8 encoding, and as a result every special character is mangled.
If you google around you can easily find a PDF of the book. In the end that's what I had to resort to (with a clean conscience, since I'd paid for the book).
marzagao | Jun 1, 2021 |

Statistics

Works: 2
Members: 14
Popularity: #739,559
Rating: 3.0
Reviews: 1
ISBNs: 2