The Role of Autoencoders and Centroid Analysis in Predicting Startup Outcomes

:::info
Authors:

(1) Mark Potanin, Corresponding author (potanin.m.st@gmail.com);

(2) Andrey Chertok, (a.v.chertok@gmail.com);

(3) Konstantin Zorin, (berzqwer@gmail.com);

(4) Cyril Shtabtsovsky, (cyril@aloniq.com).

:::

Table of Links

Abstract and 1. Introduction

2 Related works

3 Dataset Overview, Preprocessing, and Features

3.1 Successful Companies Dataset and 3.2 Unsuccessful Companies Dataset

3.3 Features

4 Model Training, Evaluation, and Portfolio Simulation and 4.1 Backtest

4.2 Backtest settings

4.3 Results

4.4 Capital Growth

5 Other approaches

5.1 Investors ranking model

5.2 Founders ranking model and 5.3 Unicorn recommendation model

6 Conclusion

7 Further Research, References and Appendix

5 Other approaches

5.1 Investors ranking model

Investors can be scored by the frequency, size, and sector of their investments. An investor can also serve as an indicator of a company's potential failure or success. The scoring was carried out in three stages:

1. Using an autoencoder model with several input modalities, we create a vector representation for each investor.

2. Based on expert estimates, we select a group of top investors and compute the centroid of this group in the vector space.

3. We rank investors by their distance from the centroid.


A higher score means the investor lies closer to the centroid of the top investors. Results are presented in Table 4. If a company's lead investor has a low score, this may indicate that the company should be excluded from consideration.


Example: the investment firm 14W has a score of 0.9 and invests in IT companies, including unicorns (for example, the European travel management startup TravelPerk).
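
The centroid and ranking stages described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the investor vectors have already been produced by a trained autoencoder, uses Euclidean distance (the paper does not name the metric), and maps distance to a score in [0, 1] with a simple rescaling chosen here for illustration. The function name `rank_by_centroid` and the toy investors `inv_a` to `inv_d` are hypothetical.

```python
import math


def rank_by_centroid(embeddings, top_names):
    """Rank investors by closeness to the centroid of a top-investor group.

    embeddings: dict mapping investor name -> embedding vector (list of floats),
                assumed to come from an already-trained autoencoder (stage 1).
    top_names:  names of the expert-selected top investors (stage 2).
    """
    top_vectors = [embeddings[name] for name in top_names]
    if not top_vectors:
        raise ValueError("top_names must not be empty")

    # Stage 2: centroid = coordinate-wise mean of the top group's vectors.
    centroid = [sum(coords) / len(top_vectors) for coords in zip(*top_vectors)]

    # Stage 3: Euclidean distance from every investor to the centroid
    # (assumed metric; the paper only says "distance").
    distances = {name: math.dist(vec, centroid) for name, vec in embeddings.items()}

    # Assumed score mapping (not from the paper): 1.0 at the centroid,
    # 0.0 for the investor farthest from it.
    max_dist = max(distances.values())
    scores = {
        name: 1.0 - d / max_dist if max_dist > 0 else 1.0
        for name, d in distances.items()
    }

    # Highest score first; sorted() is stable, so ties keep insertion order.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy 2-D embeddings for four hypothetical investors.
toy_embeddings = {
    "inv_a": [1.0, 1.0],
    "inv_b": [1.0, 3.0],
    "inv_c": [1.0, 2.0],
    "inv_d": [4.0, 6.0],
}
ranking = rank_by_centroid(toy_embeddings, top_names=["inv_a", "inv_b"])
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy data, `inv_c` sits exactly on the centroid of the two top investors and therefore ranks first, while `inv_d` is farthest away and ranks last. Note that members of the top group do not necessarily receive the highest scores themselves, since the score depends only on distance to the group's mean.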

:::info
This paper is available on arXiv under a CC 4.0 license.

:::
