Today’s articles turned out to loosely mirror the general phases of a data science project.
First, there’s data collection, then feature engineering, then modelling, and finally evaluating the performance. Of course, this outline is very general, and there is much more to it: e.g. formulating a business problem, formalizing it in mathematical terms (so that it can be modeled), EDA, data preparation and cleaning, implementation issues, or maintenance and monitoring. Additionally, the phases aren’t strictly linear, as feature engineering and modelling are repeated iteratively to try different approaches and combine them.
Often the data is provided by the client, so there’s no data collection phase, but even then the problems described in the article crop up in this seemingly trivial task.
Feature engineering is one of the keys to successful modeling. Here is a very basic tutorial to get a grasp of what it is.
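To make the idea concrete, here is a minimal sketch of feature engineering on a single raw record. The record layout, the field names (`timestamp`, `unit_price`, `quantity`) and the function `engineer_features` are made up for illustration and do not come from the linked tutorial.

```python
from datetime import datetime


def engineer_features(record):
    """Derive model-friendly features from one raw record (hypothetical schema)."""
    ts = datetime.strptime(record["timestamp"], "%Y-%m-%d %H:%M:%S")
    return {
        # A raw timestamp is hard for a model to use; its parts are not.
        "hour": ts.hour,
        "day_of_week": ts.weekday(),  # Monday = 0, Sunday = 6
        "is_weekend": int(ts.weekday() >= 5),
        # Combining two columns can expose a signal neither has alone.
        "total_price": record["unit_price"] * record["quantity"],
    }


raw = {"timestamp": "2018-06-16 14:30:00", "unit_price": 2.5, "quantity": 4}
features = engineer_features(raw)
print(features)
```

Which derived features actually help depends on the problem, which is why this step is usually revisited several times during modelling.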
There are many ML models, but gradient boosted trees now dominate classic modelling contests on Kaggle. This is the best explanation I know of on the internet (and the next article on this page has a cool interactive gradient boosting simulation).
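For a feel of the mechanics, below is a toy sketch of gradient boosting for squared error, written in plain Python. It is not the algorithm as implemented in any particular library: the weak learners are one-split "stumps" on a single feature, and all names, the data and the settings (`n_rounds`, `learning_rate`) are assumptions chosen for illustration.

```python
def fit_stump(x, residuals):
    """Find the single threshold on 1-D x that best fits the residuals."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def fit_gradient_boosting(x, y, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [
            pi + learning_rate * predict_stump(stump, xi)
            for xi, pi in zip(x, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, xi):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)


def mse(model, x, y):
    return sum((yi - predict(model, xi)) ** 2 for xi, yi in zip(x, y)) / len(y)


x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8, 7.1, 8.0]

baseline = (sum(y) / len(y), [], 0.1)  # predicts the mean everywhere
model = fit_gradient_boosting(x, y)
baseline_mse = mse(baseline, x, y)
boosted_mse = mse(model, x, y)
print(baseline_mse, boosted_mse)
```

Each round only nudges the prediction by a fraction of what the new stump suggests; that small step size is what lets many weak trees add up to a strong model without overshooting.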
Although neither precision nor recall is directly optimized, they are often used to get an overview of the performance and to explain the results.
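As a quick reminder of what the two numbers mean: precision is the share of predicted positives that are truly positive, and recall is the share of true positives that the model found. A minimal sketch for binary labels follows; the helper `precision_recall` and the example labels are made up for illustration.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Guard against division by zero when nothing is predicted or nothing is positive.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.6 0.75
```

Here the model flags five items, three of them correctly (precision 3/5), and finds three of the four real positives (recall 3/4).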
Microservices are not as good as they’re presented
How does one of the Web Components standards work? Read about what you can find under the hood and how to use it when creating reusable components.
Basic architecture concepts you should be familiar with as a web developer
Written by: Bartosz Cłapa