BIG DATA
SPECIALIST
Take the Lead
in the Big Data Revolution


3-month intensive program from industry leaders

The program is built around two business cases that represent broad classes of Big Data problems: Data Management Systems and Recommender Systems. Students learn the algorithms and tools as they work step by step through both cases, writing all the code to load, clean, and process terabytes of real-life data on an industrial Azure cluster.

You will acquire the following skills:

  • Creating cloud clusters for big data analysis;
  • Source code development (from data collection
    and structuring to analysis and visualization);
  • Solving a particular business case using an efficient
    set of technologies and algorithms.

TIMING

3 months, 3 times a week
26 Mar - 27 Jun 2019

SCHEDULE

Mon, Wed 19:00-22:00
Sat 11:00-13:00

PARTICIPATION FORMAT

Regular classes at GVA
Interactive Online

INTERACTIVE ONLINE

Multicam webcast and live face-to-face discussion
Classes recorded for future reference

PRICE

160 000 ₽

Instructors

  • Anton Pilipenko, Big Data Engineer, Mail.ru Group
  • Nikolay Markov, Senior Data Engineer, Aligned Research Group
  • Pavel Klemenkov, Chief Data Scientist (marketing), Sberbank
  • Petr Ermakov, Head of Data Analytics, Youla at Mail.Ru Group
  • Dmitry Ignatov, Deputy Head at School of Data Analysis and Artificial Intelligence, HSE
  • Alexander Ulyanov, Data Science Executive Director, Sberbank
  • Oleg Khomyuk, Head of R&D, Lamoda
  • Alexander Filatov, Product Analytics Manager, VISA
  • Alexander Petrov, Sr. Software Development Engineer, Amazon

Program

 
MODULE 1

Data Management Platform

During the first module you will learn how to use the Hadoop, HBase,
and Hive stack for big data processing and how to apply machine learning
methods to user classification.

Lead instructor:
Alexander Petrov,
Director R&D, Data-Centric Alliance


Instructor:
Alexander Krot,
Lead Data Scientist, VimpelCom

Map-Reduce, Hadoop, HDFS

Deploying a Hadoop cluster
in the Amazon Web Services cloud

HBase, Hive, Pig

Practical tricks for using HBase
Maxim Lapan, lead engineer,
RIPE NCC (Holland), ex-Mail.Ru Group

Big Data in banking
Andrzej Arshavsky, Director for Big Data, Sberbank-Technologies

Workshop: optimizing
MapReduce tasks in Hadoop

Colloquium

Storing and processing 1 TB of web logs with HBase

Data visualization using Hue
Working in IPython

Building a multi-class classifier

Machine learning
Working with Python libraries

Determining the sex and age of visitors
from their online behavior

Laboratory work

Deploying a Hadoop
cluster in the Azure cloud


Lab

 
MODULE 2

Recommender systems

The second and final module is dedicated to recommender system design and content personalization, with a focus on data analysis
and machine learning.

Instructor:
Gregory Sapunov,
Founder of eclass.cc,
former head of development at Yandex.News


Instructor:
Dilara Khakimova,
Former deputy head
of monetization research, Yandex

Non-Personalized
recommender systems

Laboratory work
Non-Personalized movie recommender system

Content-based recommender system

Text processing

Workshop
Practical methods of text analysis:
TF*IDF, vector model, clustering, classification

Recommender systems workshop by Wouter de Bie, Big Data Architect, Spotify

Colloquium

Laboratory work. Content-based online
courses recommender system

Collaborative filtering
User-user, item-item

Laboratory work. Movie recommender system with collaborative filtering using item-item
and user-user approaches

Collaborative filtering
Matrix factorization and dimensionality reduction

Laboratory work. Movie recommender system
with collaborative filtering using SVD

Algorithms

TF-IDF

Term Frequency-Inverse
Document Frequency


A term-weighting scheme used to determine how close (similar) two documents are.
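
As a rough illustration of the idea, here is a minimal sketch in plain Python. It assumes one common TF-IDF variant (relative term frequency multiplied by log(N / df)) and compares documents with cosine similarity. The function names and the toy corpus are hypothetical and are not taken from the course materials.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document (docs are lists of tokens)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # tf = relative frequency in the document, idf = log(N / df).
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus (hypothetical), already split into tokens.
docs = [
    "hadoop cluster data processing".split(),
    "hadoop cluster data storage".split(),
    "movie recommender collaborative filtering".split(),
]
vectors = tf_idf_vectors(docs)
sim_01 = cosine_similarity(vectors[0], vectors[1])  # documents sharing three terms
sim_02 = cosine_similarity(vectors[0], vectors[2])  # documents sharing no terms
print(sim_01, sim_02)
```

Documents that share rare terms score higher than documents that share only common ones, and documents with no terms in common score zero.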


Lab

Pearson correlation
coefficient

It measures the strength of the linear relationship between two
random variables.
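
For readers who want to see the formula in action, the sketch below computes the coefficient directly from its definition (covariance divided by the product of the standard deviations). The `pearson` helper and the rating lists are hypothetical examples, not course data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Unnormalized covariance and variances; the 1/n factors cancel out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ratings given by three users to the same five movies.
alice = [5, 4, 1, 2, 5]
bob = [4, 5, 2, 1, 4]
carol = [1, 2, 5, 4, 1]

print(pearson(alice, bob))    # positive: similar tastes
print(pearson(alice, carol))  # negative: opposite tastes
```

In user-user collaborative filtering, a coefficient close to 1 marks users whose ratings rise and fall together, so one user's ratings can help predict the other's.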


Lab

SVD

Singular Value Decomposition


Singular Value Decomposition
is a matrix factorization technique used to identify
hidden factors and reduce the dimensionality of data, in particular
to find other users with similar patterns.
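
A minimal sketch of this idea, assuming NumPy is available and using a small made-up rating matrix: the data is decomposed with `numpy.linalg.svd`, only the two strongest hidden factors are kept, and users are compared in that reduced space. The matrix values and variable names are illustrative assumptions only.

```python
import numpy as np

# Hypothetical user-by-movie rating matrix (rows: users, columns: movies).
ratings = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

# Decomposition: ratings == U @ diag(s) @ Vt, singular values sorted descending.
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

# Keep only the k strongest hidden factors (dimensionality reduction).
k = 2
user_factors = U[:, :k] * s[:k]    # each row describes one user in k dimensions
approx = user_factors @ Vt[:k, :]  # rank-k reconstruction of the ratings


def cosine(a, b):
    """Cosine similarity of two dense vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Users 0 and 1 rate movies alike; users 0 and 2 do not.
sim_close = cosine(user_factors[0], user_factors[1])
sim_far = cosine(user_factors[0], user_factors[2])
print(sim_close, sim_far)
```

Real rating matrices are sparse and far larger, so production systems typically use approximate factorization methods rather than a dense SVD; the sketch only shows the principle.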


Lab

Skills required for this course

  • Proficiency in high-level
    programming languages,
    especially Python 2
  • Basic knowledge
    of Linux
  • Basic
    understanding
    of SQL
  • Knowledge of probability
    and statistics at the level
    of 1-2 semesters in college
  • For those who are not sure that their
    proficiency is good enough,
    complementary workshops on Linux and Python basics are
    held in the first two weeks of the program

Career

7 years of experience, 27 recruiters, 239 clients

Because we work with Spice IT Recruitment, a leading recruiting agency in the IT field, we are confident that our alumni will be fully prepared for interviews with future employers and will get the best offers that suit their knowledge, experience, and potential.

Where our alumni work and live

Media partners
