Learn R for data science online

This online Data Science course is organized into clear, progressive steps so that every topic is easy to follow.

No previous knowledge of Data Science topics is required.

We start from zero and gradually build up using guided projects with real-time data sets.

While being accessible, the course also introduces you to most of the advanced topics you need when working on Data Science projects.

Natural Language Processing with Python

In this post, we will learn to perform natural language processing with Python.
Natural language processing (NLP) is the ability of a software program to understand human language.
In Python, NLP is commonly done with NLTK, the Natural Language Toolkit. Gensim is a separate, widely used library for NLP tasks such as topic modeling; it is not part of NLTK.
We will learn to use Gensim dictionaries and the TF-IDF model.
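
To make the two ideas concrete, here is a minimal sketch of what a token dictionary and TF-IDF weighting do. It deliberately does not call Gensim and uses only the standard library. The toy corpus, the names `token2id` and `tfidf`, and the weighting formula (raw count times log(N / document frequency)) are assumptions chosen for illustration; Gensim's own defaults may differ.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each document is already split into lowercase tokens.
documents = [
    ["python", "is", "a", "programming", "language"],
    ["nlp", "with", "python"],
    ["nlp", "is", "fun"],
]

# Dictionary step: map every distinct token to an integer id.
token2id = {}
for doc in documents:
    for token in doc:
        token2id.setdefault(token, len(token2id))

# Bag-of-words step: count how often each token id occurs in each document.
bows = [Counter(token2id[token] for token in doc) for doc in documents]

# Document frequency: the number of documents in which each token id appears.
doc_freq = Counter(token_id for bow in bows for token_id in bow)


def tfidf(bow):
    """Weight each count by log(N / document frequency) (assumed formula)."""
    n_docs = len(bows)
    return {tid: count * math.log(n_docs / doc_freq[tid]) for tid, count in bow.items()}


weights = [tfidf(bow) for bow in bows]
print(weights[1])
```

In the second document, "with" occurs in only one document and so receives a higher weight than "python", which occurs in two.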

Python for data science course online


The aim of this post is to provide a quick reference for people who want to learn Python for Data Science. As you know, Python is a general-purpose programming language and is not specific to the data science world, so a beginner faces a lot of topics. Learning every topic in Python might take a considerably long time.

How to install Python on MacOS

For MacOS, follow the steps given below:

1. Click on Download Python 3.7.0.

2. After the python-3.7.0.pkg file has downloaded, open the installer and follow the step-by-step instructions.

3. Finally, the installer should show a message confirming the installation.
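
One way to double-check the installation is to ask the interpreter for its own version. The short script below is a sketch that assumes it is run with the newly installed interpreter (for example by saving it to a file and running it with `python3` in Terminal); the variable names are illustrative.

```python
import sys

# Print the full version string of the interpreter running this script.
print(sys.version)

# version_info is tuple-like; its first three fields are major, minor, micro.
major, minor, micro = sys.version_info[:3]
print(f"Running Python {major}.{minor}.{micro}")
```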

Where should I write and run Python code?

Python IDLE editor

To write and run Python code, you can use an IDE (Integrated Development Environment) such as PyCharm, IDLE (Integrated Development and Learning Environment), or Jupyter Notebook.

Since we have just started learning, let’s use IDLE, as it is the simplest of these three editors.

To launch IDLE, go to your Python installation directory, where you will find the Lib folder.

Inside the Lib folder you will find the idlelib folder. Double-click the idle.bat file in the idlelib folder and IDLE will start. Note that idle.bat is a batch file, so this step applies to Windows installations.

Let us declare a variable i, assign 100 to it, and press Enter. The variable i will now contain 100. Let’s verify by printing the variable i using print(i).
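
The two steps just described look like this when typed into IDLE (shown here as a plain script):

```python
# Assign the integer 100 to the variable i.
i = 100

# Print the value to confirm the assignment; this prints 100.
print(i)
```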

IDLE is good for practicing simple examples.

If you want to write code in IDLE that contains if clauses or loops, you need to be aware of the indentation rules of Python.

To make this easier, we need an editor that indents automatically as we type the code. This is where Jupyter Notebook will be helpful.
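
The indentation rules mentioned above can be seen in a small sketch. The variable names and values are only illustrative; the point is that the body of an if clause or a loop must be indented consistently.

```python
i = 100

# The body of an if clause must be indented (four spaces by convention).
if i > 50:
    message = "i is greater than 50"
else:
    message = "i is 50 or less"
print(message)

# The body of a for loop is indented the same way.
total = 0
for n in range(1, 4):
    total += n
print(total)
```

If the indented lines were written flush with the left margin, Python would reject the code with an IndentationError.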

Contact Us:

Mail: honingds01@gmail.com

Website: https://honingds.com/python

Data science training Los Angeles CA


I am an online Data Science trainer with extensive experience in Data Science, Python, R, Scala and the Big Data stack.

I have been a trainer for more than 5 years, teaching courses such as Python, R, Scala, Statistics, Machine Learning, Hadoop and Apache Spark.

I have given Data Science online training for the last couple of years and have trained many professionals, including students, developers, architects, analysts and project leads, from companies such as GE, Genpact, Metlife-USA, HCL, Amazon, Bank of America and Microsoft, as well as students from the US, UK and Germany.


I am happy to say that, over all these years, my students have been 100% satisfied and are working as Data Scientists without depending on others during interviews and on the job.

I don’t teach at any institute; all my trainings are completely online!

Please find my Data Science Course details below:

Data Science Course details:

Mode: Online (through either Zoom or GoTo Meeting)
Contact: tutordatascience@gmail.com (or) +91 8367299271

Timings: 6:30 AM to 8:30 AM IST (weekdays only)

1.Python Programming
2.R Programming

  • R Installation
  • R Studio Installation
  • Using an R notebook file
  • Programming using Vectors
  • Character data type
  • Attributes
  • Comparing single values against vectors
  • Indexing using logical data types
  • Performing arithmetic on vectors
  • Vector recycling rule
  • Appending data to vector
  • Introduction to Matrices
  • Creating matrix
  • Vector & Matrix data types
  • Naming rows and columns
  • Finding dimensions of a matrix
  • Creating new columns and rows
  • Subsetting a matrix by element
  • Returning specific rows and columns from matrix
  • Sorting a matrix
  • Sorting and previewing data
  • Dataframes in R
  • Examining the internal structure of dataframes
  • Representing categorical values using factors
  • Selecting data by rows and columns
  • Selecting specific values
  • Using comparison operators to filter values
  • Combining conditions using logical operators
  • Sorting a dataframe in R
  • Lists in R
  • Naming lists
  • Adding values to a list
  • Indexing a list
  • Changing values in a list
  • Merging lists
  • R control structures
  • If & else statements
  • For loops
  • Adding results of loop to an object
  • Using if else within for loop
  • Using while loop
  • Introduction to functions
  • Nested functions
  • Adding control structure to a function
  • Apply functions in R
  • Using lapply with custom functions
  • Using sapply over built in functions
  • Using sapply over custom functions
  • Using vapply to control returned values
  • Using tapply on dataframes and matrices
  • R Strings & Dates
  • Concatenating strings in R
  • Updating column in a Dataframe
  • Extracting a substring
  • Difference between strsplit and paste()
  • Replacing value in a string
  • Removing whitespaces from string
  • Extracting parts of a date
  • Creating a new column in dataframe
  • Guided project using R

3.Numpy


  • Understanding Numpy ndarrays
  • Selecting and slicing rows and items from ndarrays
  • Selecting columns and custom slicing ndarrays
  • Vector math
  • Arithmetic numpy functions
  • Calculating statistics for 1-d ndarrays
  • Calculating statistics for 2-d ndarrays
  • Adding rows and columns to ndarrays
  • Sorting ndarrays
  • Numpy Boolean arrays
  • Boolean indexing with 1-d ndarrays
  • Boolean indexing with 2-d ndarrays
  • Assigning values in ndarrays
  • Assignment using boolean arrays
  • Two guided projects with Numpy

4.Pandas


  • Introducing dataframes
  • Selecting columns from a dataframe by label, using loc method
  • Column selection shortcuts
  • Pandas Series
  • Selecting items from a series by label
  • Selecting rows from a dataframe by label
  • Series and dataframe describe methods
  • Other data exploration methods
  • Assignment with Pandas
  • Using boolean arrays to assign values
  • Guided project 1 with Pandas
  • Exploring data with Pandas
  • Using iloc to select by integer position
  • Reading csv files with Pandas
  • Working with integer labels
  • Using Pandas methods to create boolean masks
  • Using boolean operators
  • Pandas index alignment
  • Using loops in Pandas
  • Guided project 2 with Pandas

5.Data Cleaning with Pandas

  • Cleaning column names
  • Converting string columns to numeric
  • Practise converting string columns to numeric
  • Extracting values from the start of strings
  • Extracting values from the end of strings
  • Correcting bad values
  • Dropping missing values
  • Filling missing values
  • Coding challenge
  • Reordering columns and exporting clean data
  • Guided project on Data Cleaning


7.Probability and Statistics
8.Machine Learning

10.Linear Algebra

11.Linear Regression
12.Decision Trees

Contact Us:

Email: honingds01@gmail.com

Website : https://honingds.com

Python for data science

Generating random numbers using the Python standard library

The Python standard library provides a module called random, which contains a set of functions for generating random numbers.

The Python random module uses a popular and robust pseudo random number generator called the Mersenne Twister.

Let us now look at the process in detail.



First, we need to understand why we call the seed() function when generating random numbers.

Let us try generating random numbers without calling seed(). Then we will see the effect of using seed().

Called without a prior seed(), the random() function returns a random float in the interval [0.0, 1.0). In that case Python seeds the generator automatically, so each run of the program produces different values, and each subsequent call to random() generates a new float.
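
A minimal sketch of this unseeded behaviour follows; the list name `values` is just for illustration. No expected output is shown because the numbers differ on every run.

```python
import random

# No call to seed(): Python seeds the generator itself.
values = [random.random() for _ in range(3)]

# Each value lies in the half-open interval [0.0, 1.0).
print(values)
```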


If we call seed() with a fixed value before calling random(), the chain of calls after random.seed() will produce the same sequence of values every time.

The example below seeds the pseudorandom number generator, generates some random numbers, and shows that reseeding the generator results in the same sequence of numbers being generated.
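
The original listing is not reproduced here, so the following is a sketch reconstructed from the description: seed with the value 4, draw three numbers, reseed with 4, and draw three more. The variable names `first` and `second` are assumptions.

```python
import random

# Seed the generator with a fixed value and draw three numbers.
random.seed(4)
first = [random.random() for _ in range(3)]
print(first)

# Reseed with the same value and draw three numbers again.
random.seed(4)
second = [random.random() for _ in range(3)]
print(second)

# The two sequences are identical because the seed is the same.
print(first == second)
```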


Running the example seeds the pseudorandom number generator with the value 4, generates 3 random numbers, reseeds the generator, and shows that the same three random numbers are generated.

Notice the repetition of “random” numbers. The sequence of random numbers becomes deterministic: it is completely determined by the seed value, 4.

It can be useful to control the random output by setting the seed to some value to ensure that your code produces the same result each time.

