TLD Vision is an AI company focused on tracking objects in video. We are building a new tracking engine that combines the latest advances in deep neural networks, real-time rendering, and 3D computer vision. Our goal is to enable new kinds of applications in broadcasting and augmented reality that require extreme robustness and precision.

3D Modeling

Tracking with a bounding box has already reached its potential; future trackers require a more accurate representation. We use a 3D model for that. This model can be general (a human head) or very specific (a Porsche 911). We don't care which.
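As an illustration only (this is not the TLD engine), the following minimal sketch shows why a 3D model carries more information than a bounding box: given a model and its pose, the box can always be derived by projection, but the box alone cannot recover the pose. The pinhole camera, the `Pose` class, the cube model, and all parameter values are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    """Hypothetical object pose: rotation about the vertical axis plus translation."""
    yaw: float   # radians, rotation about the y axis
    tx: float    # translation in camera coordinates
    ty: float
    tz: float


def project(vertices, pose, focal=800.0, cx=320.0, cy=240.0):
    """Project 3D model vertices to pixels with an assumed pinhole camera."""
    c, s = math.cos(pose.yaw), math.sin(pose.yaw)
    pixels = []
    for x, y, z in vertices:
        # Rotate about the y axis, then translate into the camera frame.
        X = c * x + s * z + pose.tx
        Y = y + pose.ty
        Z = -s * x + c * z + pose.tz
        if Z <= 0:
            raise ValueError("vertex is behind the camera")
        pixels.append((focal * X / Z + cx, focal * Y / Z + cy))
    return pixels


def bounding_box(pixels):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around projected points."""
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    return (min(xs), min(ys), max(xs), max(ys))


# A unit cube stands in for the 3D model (a head, a car, ...).
CUBE = [(x, y, z) for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]

front = bounding_box(project(CUBE, Pose(yaw=0.0, tx=0.0, ty=0.0, tz=5.0)))
turned = bounding_box(project(CUBE, Pose(yaw=math.pi / 4, tx=0.0, ty=0.0, tz=5.0)))
```

Here `front` and `turned` are both plain rectangles; the change in orientation only shows up as a slightly different box size, which is the kind of detail a box-only tracker loses.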

Rendering Engine

Our rendering engine is used to draw graphics into the video stream, to augment the training set with synthetic examples, and, most importantly, to provide a bridge between the computer vision (CV) and computer graphics (CG) worlds. It is hard for us to draw a clear boundary between the rendering engine and our custom neural network.

Neural Network

We created a custom neural network that is tightly interlinked with our rendering engine as well as our annotation tool. We never use external frameworks: they are too rigid for our purpose and hide important details from our development process.

Video Annotator

Data is key. When possible, we simulate training data using our rendering engine. But for mission-critical tasks we rely on extremely efficient annotation tools that allow a single annotator to label 20k samples per day. Our annotation tool is addictive and feels like a game.

The company was founded by Dr. Zdenek Kalal in 2011 after his PhD thesis gained worldwide recognition. He is considered an expert in the field of object tracking and still actively participates in prestigious scientific conferences.


2011, TLD2, ICT Pioneers Prize
2017, TLD3, NVIDIA Inception Program Contest