OSEMN Framework

The OSEMN Framework is a data science process for collecting, cleaning, exploring, modelling, and interpreting data. It consists of five steps, spelled out by the acronym OSEMN: Obtain Data, Scrub Data, Explore Data, Model Data, and Interpret Data (the N comes from the second letter of "iNterpret").

The 5 Steps of the OSEMN Framework

1. Obtain Data

The first step of any data science project is obtaining data. The step is somewhat self-explanatory, but let's dive a little deeper, as data can be obtained in many different ways.

Internal data is probably the most common source of data. It is data collected and stored by your own organization. Obtaining and working with such data requires knowledge of a programming language such as Python or R. The data is most commonly stored in a relational database such as MySQL or PostgreSQL, which requires familiarity with those databases and with a query language like SQL or some sort of ORM. In some cases data is stored in other types of databases, such as NoSQL or graph databases. MongoDB, CouchDB, and Neo4j are common examples, each of which requires its own set of skills. If you are working with larger datasets (Big Data), tools like Hadoop and Spark might come in handy.
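As a minimal sketch of obtaining internal data with SQL from Python, the example below uses the standard-library sqlite3 module with an in-memory database as a stand-in for a MySQL or PostgreSQL server. The "orders" table, its columns, and its rows are hypothetical and exist only for illustration; a real project would connect to the organization's own database with the appropriate driver.

```python
import sqlite3

# In-memory SQLite database standing in for an internal relational database.
# The "orders" table and its contents are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 120.0), ("bob", 75.5), ("alice", 30.0)],
)
conn.commit()


def total_per_customer(connection):
    """Obtain one aggregated row per customer using a plain SQL query."""
    cursor = connection.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return cursor.fetchall()


rows = total_per_customer(conn)
print(rows)  # [('alice', 150.0), ('bob', 75.5)]
```

The query itself is ordinary SQL, so the same SELECT statement would carry over to most relational databases; only the connection setup differs.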

Another way to obtain data is from an external source via some sort of Web API. That requires knowledge of a programming language to fetch the data, as well as a broad understanding of JSON or XML, which are data formats readable by both humans and machines.
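A rough sketch of this approach in Python, using only the standard library, could look like the following. The fetch_json helper shows how a JSON response might be downloaded, but it is not called here because no real API is named in this article. The JSON body, its "readings" key, and the field names are hypothetical and stand in for whatever an actual API would return.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download a URL and decode its body as JSON (not called in this sketch)."""
    with urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_temperatures(payload):
    """Pull (city, temperature) pairs out of the hypothetical payload shape."""
    return [(reading["city"], reading["temp_c"]) for reading in payload["readings"]]


# Hypothetical response body, used in place of a real network call.
sample_body = (
    '{"readings": [{"city": "Oslo", "temp_c": 4.5},'
    ' {"city": "Lima", "temp_c": 19.0}]}'
)

payload = json.loads(sample_body)
temps = extract_temperatures(payload)
print(temps)  # [('Oslo', 4.5), ('Lima', 19.0)]
```

Against a real API, payload would come from fetch_json(url) instead of the sample string, and the extraction function would need to match that API's documented response structure.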

Manual collection is the most tedious way of obtaining data. However, in many cases it is necessary, and it is usually the first step in getting real-world data into software, applications, or data science projects. Manual collection usually requires an understanding of spreadsheets, CSV (Comma-Separated Values) files, and similar formats.
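To illustrate, manually collected data exported from a spreadsheet as CSV can be read with Python's standard csv module. The sketch below parses a small inline CSV string instead of a file so that it is self-contained. The column names and values are hypothetical.

```python
import csv
import io

# Hypothetical spreadsheet export; a real project would open a .csv file instead.
raw_csv = "name,age,city\nAnna,34,Stockholm\nBen,,Berlin\n"

# DictReader uses the first row as column names and yields one dict per row.
reader = csv.DictReader(io.StringIO(raw_csv))
records = list(reader)

# Every value is read as a string, and empty cells become empty strings.
missing_age = [row["name"] for row in records if row["age"] == ""]

print(len(records))  # 2
print(missing_age)   # ['Ben']
```

Note that values arrive as text and gaps appear as empty strings, which is typical of hand-entered data and is the kind of issue the next step, scrubbing, has to deal with.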

2. Scrub Data

3. Explore Data

4. Model Data

5. Interpret Data

What is the OSEMN Framework used for?

In data science, it's important that data is handled the right way to achieve a successful analysis. The OSEMN Framework is used by data scientists to remember the process of obtaining, scrubbing, exploring, modelling, and interpreting data.
