5 Free Resources Every Data Scientist Should Start Using Today
credit : learning.naukari.com
Four years ago, I started my career as a recent college graduate at a four-person IoT startup. My first task was to conduct research and write a proposal for an AI-based digital assistant for military settings. Although I studied engineering in college and worked in a lab that did machine learning research, it was very difficult to start a huge natural language processing project without an experienced data scientist or engineer to guide me. I had to turn to online resources to fill in the gaps and find direction, and to look for mentors outside the organization for personal development.
Currently, I am working on data infrastructure for our IoT platform and training the company's full-stack engineers and product managers in data science so they can analyze our large-scale IoT data. This article is a compilation of the resources I have used and the tips I have learned over the years while developing my own career in data science and engineering. Whether you are an engineer looking to move into the data industry or a recent graduate preparing for your new role, I hope my tips will be useful to you.
Subscribe to newsletters
I started by subscribing to various data science and data engineering newsletters before jumping into popular MOOCs or buying recommended books on Amazon. At first, I read each article and took notes, but over time I learned to identify and focus on the important links that were shared across most newsletters. Newsletters are great for keeping up to date with popular blog posts that share new tools, academic research, and engineering posts from major Internet giants (Google, Netflix, Spotify, Airbnb, Uber, etc.).
Here are some of my favorite newsletters:
- Tristan Handy's Data Science Roundup: Tristan provides his own commentary on a curated list of data science articles.
- Data Science Weekly: A curated list of articles and blog posts related to data science, AI and ML. I also find its training and resources section useful for online tutorials.
- Hacker Newsletter: A weekly newsletter with hand-picked articles from Hacker News. It does not specialize in data science or engineering, but it has separate sections on data and code.
- AI Weekly from VB: Commentary from VentureBeat authors along with a collection of articles related to AI.
I also subscribe to Data Machina, The Analytics Dispatch and AI Weekly.
Create your own data course
Next, depending on your goals, you should design your own data science, data engineering or data analyst curriculum. This may include learning how to program in Python or R if you are changing careers from a non-programming role. If budget is not a concern, joining a boot camp or taking courses from Udacity and Dataquest is a good way to get online mentorship from industry experts. However, if you are price-conscious like me, you may choose to follow these open source guides to create a free curriculum:
- Open Source Society University Data Science
- Andrew Ng's Coursera ML course in Python
- Python Machine Learning book GitHub resources
- Hacker's free data engineering resource
- David Venturi's free data science master's curriculum
- Data science for startups
- TopBots' summary of top AI research
The one caveat here is that taking these courses is not enough. I often find that online courses and tutorials either walk through a short example focused on foundational knowledge (e.g. math, statistics, theory) or offer a simplified walkthrough. This is especially true for big data, because tutorials use a small subset of data that runs locally rather than walking through a full production setup in the cloud.
To complement the theory with realistic scenarios, I suggest joining Kaggle and using Google's free tools like Colab to work with large datasets. You can also browse GitHub repos from Udacity students to see what a capstone project looks like.
Network with experts for free
Any career guide will tell you that networking is important. But how do you find an industry expert who is willing to mentor you or answer specific questions? Before the pandemic, attending Meetups was an option, but that opportunity was largely limited to residents of major tech hubs (at least in the US) such as the Bay Area, New York or Seattle. Other options include attending conferences or workshops focused on data science, machine learning or data engineering. However, tickets to these events are so expensive that it is nearly impossible to attend without company sponsorship.
As an employee of a Baltimore-based startup, I networked online instead, by watching the free videos of sessions from technology conferences (such as AWS re:Invent, Microsoft Ignite, or Google Cloud Next) and connecting with the speakers on LinkedIn. In addition to keynotes and sessions on new cloud product releases, there are many sessions on best practices and architecture where a lead developer from a product team or an industry partner (e.g. Lyft, Capital One, Comcast) presents alongside a solutions architect from AWS / Azure / GCP on solving a real problem. I would take notes during the session, and then reach out to the speakers on LinkedIn with a question about an architectural decision or a product they described. Surprisingly, almost all the speakers were willing to connect and continue the conversation with me, even though I was working at an unknown startup at the time.
Over time, I have steadily expanded my network this way, with the added benefit of staying up to date on new products and industry trends from all the major cloud providers. With COVID-19 and the continuing shift towards virtual events, this may become the new standard for networking, without having to attend conferences to meet others in person.
Get certified
Although cloud certifications are not proof of competence or data knowledge, I still think there is value in investing in them. This is especially true if you are aiming to become a data engineer, as cloud knowledge is essential for running production workloads. Even for data scientists, getting acquainted with cloud products allows you to focus on analyzing data rather than on the mechanics of loading and cleaning it.
Another underrated benefit of getting certified is the network it opens up. LinkedIn has very active groups, especially in technical consulting, whose members post about new opportunities in cloud data roles. Some job openings are posted only to LinkedIn groups for certificate holders. Certification may not directly lead to a new job or position, but having those badges makes it easier to start conversations with peers or employers. Personally, I picked up a few small consulting projects after I got certified.
Solve real problems
Finally, as in any engineering discipline, you can only improve with practice. If you are already working as a data scientist or data engineer, getting real-world experience is not a problem. For others looking to make the transition, many recommend creating a portfolio. But where do you start? Working with the classic Titanic dataset for survival classification or clustering the iris dataset is unlikely to make your portfolio stand out.
Instead, try to use public GitHub projects as inspiration. See what others are building, drawing on the network you have gathered on LinkedIn through technical sessions and certifications. Feel free to use examples from Udacity or Coursera projects on GitHub. Then mix in real datasets from Google Research, Kaggle, or your own search for an interesting dataset, and start building solutions to real problems.
If you are interested in a particular field or a specific organization, search for a public dataset and try to create a sample project. For example, if you are interested in fintech, try using Lending Club's public loan data to create a loan approval algorithm. The biggest takeaway from working with real datasets is that they are much messier and noisier than what you are given in academic settings.
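To give a feel for that messiness, here is a minimal Python sketch of the first steps of such a project: cleaning a few rows and scoring a naive approval rule. Everything in it is an assumption for illustration. The sample rows are made up, the column names (`loan_amnt`, `int_rate`, `annual_inc`, `loan_status`) are hypothetical rather than the schema of any real dataset, and the 30% loan-to-income threshold is an arbitrary baseline, not a recommended lending policy.

```python
import csv
import io

# Made-up sample rows. The column names are hypothetical and are not
# taken from any real lending dataset.
RAW_CSV = """loan_amnt,int_rate,annual_inc,loan_status
10000, 13.5% ,55000,Fully Paid
n/a,9.9%,72000,Charged Off
25000,,48000,Fully Paid
8000,21.0%,,Charged Off
15000,11.2%,61000,fully paid
12000,24.5%,30000,Charged Off
"""


def parse_number(text):
    """Return a float from a messy numeric string, or None if unusable."""
    cleaned = text.strip().rstrip("%").replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None


def clean_rows(raw_csv):
    """Parse the CSV text and drop rows with missing or malformed fields."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_csv)):
        amount = parse_number(record["loan_amnt"])
        rate = parse_number(record["int_rate"])
        income = parse_number(record["annual_inc"])
        status = record["loan_status"].strip().lower()
        if (
            None in (amount, rate, income)
            or income <= 0
            or status not in ("fully paid", "charged off")
        ):
            continue
        rows.append(
            {
                "amount": amount,
                "rate": rate,
                "income": income,
                "repaid": status == "fully paid",
            }
        )
    return rows


def approve(row, max_ratio=0.3):
    """Naive baseline: approve when the loan is a small share of income."""
    return row["amount"] / row["income"] <= max_ratio


def accuracy(rows):
    """Share of rows where the baseline decision matches the outcome."""
    hits = sum(approve(row) == row["repaid"] for row in rows)
    return hits / len(rows)


loans = clean_rows(RAW_CSV)
print(f"kept {len(loans)} of 6 rows after cleaning")
print(f"baseline accuracy: {accuracy(loans):.2f}")
```

Half of the toy rows are dropped during cleaning, which is the point: on a real dataset, deciding how to handle missing and inconsistent values usually takes more effort than the model itself. A rule like `approve` is only a baseline to beat once you move on to a proper classifier.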
There are clearly many ways to advance your career and make a dent in the data industry. I am by no means a data expert, and I am still learning and growing every day. If you have other resources, tips or advice, please leave a comment below and I will update this post to help others further their data careers.