Mithrandir
Key Player
Good afternoon friends.
As some of you are aware, I'm currently a graduate student in computer science, concentrating in data analytics and machine learning. Most of you probably know that I'm very interested in footie data and have enjoyed doing exploratory analysis in the past.
For my graduate degree, I need to complete a "capstone" research project. Given that, it seemed logical to base my project on something that I actually give a shit about, which led to this fairly ambitious idea.
I want to create a system that can automatically scan a footie stats database, collect the relevant data, and store it. From there, the system will combine that data with manually generated metrics derived from it to build a predictive model of expected goals scored vs expected goals conceded. I'll probably start with standard regression and see how that turns out. A more advanced machine learning model will probably do better, but that will be developed later down the line. Some other machine learning method (Bayesian Deep Learning?) is almost certainly going to be necessary in the long run to outperform human predictors.
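To give a rough idea of what the "standard regression" starting point could look like, here is a minimal sketch. It assumes a single made-up feature (shots on target per match) and toy numbers invented purely for illustration; the function name `fit_line` is hypothetical, and the real model would use many more metrics pulled from the stats database.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y move together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data, invented for illustration only (not real match stats):
# average shots on target per match vs average goals scored per match.
shots_on_target = [2.0, 3.0, 4.0, 5.0, 6.0]
goals_scored = [0.5, 1.0, 1.5, 2.0, 2.5]

intercept, slope = fit_line(shots_on_target, goals_scored)

# Expected goals for a hypothetical side averaging 4.5 shots on target.
predicted_goals = intercept + slope * 4.5
print(f"expected goals: {predicted_goals:.2f}")
```

The same fit run on goals conceded would give the other half of the scored vs conceded picture. A straight line can predict negative goals for weak sides, which is one reason a fancier model is likely needed later.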
Once I get to this point, it's time to heavily test and improve the model on new, actual matches. I envision getting the model workable, and then potentially entering my system into the Prediction League (next year, not this year), to see how it performs relative to knowledgeable football fans (that's you lot), with an eye towards continued improvement of the predictive model.
If I get to that point, I would like to add another module to the system: one that can scan and record the information from betting sites for each game the system has predicted. From there, it would compare this betting data with the predictive model's output and make recommendations to maximize expected winnings.
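The comparison step could be sketched roughly as follows. This assumes the model outputs win/draw/loss probabilities and that the betting site quotes decimal odds; the probabilities, odds, and function names (`expected_profit`, `best_bet`) are all made up for illustration and are not taken from any real bookmaker or library.

```python
def expected_profit(model_prob, decimal_odds, stake=1.0):
    """Expected profit of one bet, given the model's probability of winning it."""
    win_amount = (decimal_odds - 1.0) * stake  # profit if the bet wins
    return model_prob * win_amount - (1.0 - model_prob) * stake


def best_bet(model_probs, odds):
    """Return (outcome, expected profit) for the best bet, or (None, 0.0) if none is positive."""
    evs = {outcome: expected_profit(model_probs[outcome], odds[outcome])
           for outcome in model_probs}
    outcome = max(evs, key=evs.get)
    if evs[outcome] <= 0:
        return None, 0.0  # no bet has positive expected value, so recommend nothing
    return outcome, evs[outcome]


# Hypothetical numbers for a single match.
model_probs = {"home": 0.50, "draw": 0.25, "away": 0.25}
bookie_odds = {"home": 2.20, "draw": 3.40, "away": 3.60}

pick, ev = best_bet(model_probs, bookie_odds)
print(pick, round(ev, 3))
```

With these numbers only the home win has a positive expected profit per unit staked, so that is the only bet the sketch would recommend. A recommendation is only as good as the model's probabilities, so this would need the heavy testing described above first.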
I'm hoping to have the standard regression model ready for next season. To get to this point, I'm going to have to brush up on my Python a bit to create a couple of good web crawlers and program the predictive model itself. This will take some time, since I'm still working full-time, but having something in place by next summer seems very doable. Once that model is in place, I will work on developing a better machine learning model alongside it.
I apologize for the random, technical, meandering post. I've been thinking about this for a while and wanted to type out my thoughts. As I mentioned, it's an ambitious project, but I think it's definitely possible with enough time.