Where I start
I want to share my experience of preparing for data scientist interviews. Here's my background. All I want you to know is that I'm a normal guy, not a lucky one. I worked at a school as a research assistant for 6 months and applied to 1300 different positions. If I started over, I would probably need less time to get an offer.
- 3 years of firmware development (coding C for microchips)
- NLP research experience, with no published papers
- Top 100 CS school
- 3.6 GPA
- International student. No GC, no family in the U.S.
It's easy to have a better starting situation than mine, so don't worry.
Checklist
This is the basic preparation you should finish to gain the knowledge needed to get through interviews. Be prepared for these steps.
- Leetcode
- Sklearn API
- The questions in this blog :P
- Deep learning
Leetcode
In the beginning I didn't believe this was necessary for DS, until I had failed at least 5 different interviews. The reason companies give coding challenges for DS positions is to screen out fake profiles. Many institutions have experts who train candidates to pretend to be senior developers. A coding challenge is an efficient way to filter out fake experts.
Our goal is clear: don't put too much time into practicing this, but still be able to show coding ability.
Pick the tag for one company. I chose Facebook as my target. Any company is fine, but FB has a reasonable number of questions and they cover most data structures. 200 questions isn't a big number to finish.
Work through the questions by data structure instead of by difficulty. It helps you build an intuition for guessing a possible solution. In my experience, once you finish 3~5 questions on one data structure, you will know how to build the foundation for solving that kind of problem.
How do you solve problems the proper way? I have a study group, and we all face the same situation: sometimes we get stuck on one question for hours.
Should I check the solution before I actually pass some test cases?
I think the answer is NO, especially when you are starting a new type of question. Always set a timer, say 20 minutes. If you still have no idea how to start when it runs out, don't hesitate to check the discussion and the solution. Then another problem comes up.
**Do you really, really understand?**
It's too easy to fool yourself. There is only a fine line between understanding and copying. Doing the question again the next day is the best way to challenge yourself.
Sklearn documentation
Here we are: the knowledge specific to data scientists. I don't know how machine learning courses are structured at other schools, and it's hard to follow a step-by-step review list built from lectures.
Sklearn is a perfect tool, and its API list lets you go through most ML algorithms. I believe there are more algorithms out there, but knowing the common ones is essential.
Here's a screenshot of part of the API documentation. I believe a qualified data scientist should know most of these functions.
It's very important to know more than what school teaches. That's the reason I failed many times. Remember one thing: every parameter exists to solve some problem. Take logistic regression as an example: class_weight is one way to handle imbalanced data. I never heard about it in my machine learning class. Real problems never come with wonderful data sets.
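To make the class_weight point concrete, here is a minimal sketch in plain Python of the "balanced" heuristic, which weights each class by n_samples / (n_classes * class_count). The helper name `balanced_class_weights` and the toy labels are my own assumptions for illustration, not part of sklearn.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Hypothetical helper: weight each class by n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Made-up imbalanced labels: 90 negatives and 10 positives.
y = [0] * 90 + [1] * 10
weights = balanced_class_weights(y)

# The rare class gets the larger weight, so its errors cost more during training.
print(weights)
```

In sklearn itself you would not compute this by hand: `LogisticRegression(class_weight="balanced")` applies the same formula, and you can also pass your own dict such as `class_weight=weights`.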
Deep learning
The quickest way is to do at least one hands-on deep learning project on Colab. Spending some time on setting up an environment for training deep learning models and using a GPU is a good exercise, but I don't think it's necessary for interviews.
The beginner-friendly course is on Coursera. There are two ways to get the full course content.
- Pay. It charges monthly, and you can finish in 1 month.
- Apply for financial aid. It might take a month to get approved. If you are a student, you will probably be approved.
Make sure you understand every line of each chapter's final project. It's too easy to cheat yourself on this. Don't try to get the certificate as fast as you can. Please don't do it.
I don't know PyTorch. Keras and TensorFlow have well-documented APIs. I suggest working through the examples provided by TensorFlow until you understand them.
DS questions
When I started preparing for interviews, I wondered whether there was a list I could check off one by one, so that once I finished it all I would pass most interviews. I spent a ton of time trying to crack DS questions this way. I don't think such a list exists.
The most exciting interview I had was with Amazon. I applied for an application engineer position, not a data scientist one. A good interviewer should have a much larger knowledge base than you, in either breadth or depth. The questions should push you to the edge of your knowledge. It's like exploring an unknown dungeon: try to get as far as you can.
However, there are still really basic questions that some job seekers can't answer. The rest is trial and error, though in practice only a few scenarios come up in design questions.
How to apply for jobs
Resume
A very, very common question is: do I have a perfect resume? I don't think there's one legendary principle that can help us build a perfect resume, just like there's no perfect girl (or boy) in the world. It varies with each company's system.
Networking vs. applying as much as you can
We all know networking is very important. You'll always hear someone say you shouldn't waste your time applying to many jobs and should spend it on networking instead. I don't agree with this. We can be greedy and shoot for both targets. Networking has peak periods, so you can arrange your schedule to achieve both.
Interview
There are a few general types of interview.
- Phone interview
- Online assessment
- Machine learning case study
- On-site (I can't tell, since I haven't been on-site).