Getting started with Pandas using Docker

The tool I use most as a data analyst is Python’s pandas library. It works with most common data formats, SQL, and cloud platforms like AWS, Google, and Microsoft. The biggest challenge I have recommending it to friends and colleagues is that it’s pretty intimidating. Most analysts have some programming experience through Excel, macros, or VBA.

But it is a huge jump from understanding the logic behind programming to getting a programming language and its dependencies working on your computer. I’ll show you a stable way to get pandas on your computer, and then how to automate calculations between columns, clean dates, and generate a pivot table without opening Excel.

Getting the right Python version, packages, and system variables can be a huge frustration when getting started. To simplify this we are going to use Docker. It seems pretty crazy to add another piece of complicated technology to get the first complex technology to work. But it will be easier to get started with a couple of commands than to install everything from scratch.

There are also other ways to use pandas, like Google Colab and AWS SageMaker, and the Jupyter team has made Docker “stacks” themselves.

Getting Started

Step 0: Download Git (Git – Installing Git) and Docker (Docker Desktop for Mac and Windows). They might take a few minutes to download. This works on both Mac and PC.

On Windows, you have to make sure you’re running the most up-to-date version. You also need to have WSL installed, but that should come with Docker.

Step 1: Open Terminal (Mac) or WSL (Windows).

Step 2: Change the directory to your desktop. Type the following.

cd ~/Desktop/

Step 3: Download my repo from GitHub.

git clone https://github.com/kylepierce/pandas-docker-example.git

Step 4: Go into the repo directory

cd pandas-docker-example

Step 5: Type in the following command.

docker-compose up

Docker Compose should come with the Docker Desktop app for Mac or Windows. If you get an error, you can check that it is installed by typing “docker-compose --help”. Once the command runs, you should have a screen that looks like this.

If you go to https://localhost:8888 you should see this.

Digging into the data

Now that we have the infrastructure out of the way we can focus on the programming side.

This is your “file system”. Right now there are two items: a data folder where the CSV is located, and a file called ‘calculate_metrics.ipynb’.

If you open calculate_metrics.ipynb you’ll see this.

Notebooks are a great way to learn because you can run individual code blocks one at a time and see their output. If one breaks you’ll know pretty quickly.

Click on a cell (hover to see a border appear around each cell) and hit the Run button at the top of the page. You’ll see a green asterisk spinning, which then turns into a 1 when the cell is complete. You just imported the pandas package!

The first thing we will do is load our data into our notebook. I included a dummy ‘metrics.csv’ file in the ‘Data’ folder. Click on the 2nd cell and click Run again, and our CSV will be turned into a DataFrame. A DataFrame is similar to a spreadsheet, but it gives you more tools to manipulate data.
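As a rough sketch of what loading a CSV into a DataFrame looks like, here is a version you can run anywhere. The column names and values below are made up for illustration and may not match the real ‘metrics.csv’; the text is read from memory so the example does not depend on the file being present.

```python
import io
import pandas as pd

# Hypothetical stand-in for the dummy metrics file; the real columns may differ.
csv_text = """date,store_visits,store_sales,phone_calls,phone_sales
1/4/2021,120,30,80,8
1/5/2021,150,45,60,9
1/6/2021,100,20,90,18
"""

# pd.read_csv accepts a file path or a file-like object.
# In the notebook you would pass the path to the CSV instead of StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
```

Each column in the CSV becomes a named column in the DataFrame, and each line becomes a row.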

In the ‘see your data’ section there are a few ways to view the data. Click Run on each cell to see the output.

That might not be too exciting for a small file, but if you’re working with a file that has a million rows or hundreds of columns, it is nice to see the column names and data types before working on it.
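A few common ways to peek at a DataFrame are sketched below. The data is a small made-up table with hypothetical column names, not the contents of the real file, and the notebook may use different calls.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "date": ["1/4/2021", "1/5/2021", "1/6/2021"],
    "store_visits": [120, 150, 100],
    "store_sales": [30, 45, 20],
})

print(df.head(2))           # first 2 rows (head() defaults to 5)
print(df.columns.tolist())  # column names as a plain list
print(df.dtypes)            # data type of each column
print(df.shape)             # (number of rows, number of columns)
```

On a very large file, these calls let you check the structure without scrolling through every row.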

Now to get into more interesting things. Say we want to decide whether to hire another in-store salesperson or another phone salesperson. We can calculate the conversion rate for in-store sales and for sales by phone call to help us make our decision.
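A calculation between columns can be written as a single line per new column. This is a minimal sketch that assumes conversion rate means sales divided by visits (or calls); the column names are hypothetical and the numbers are invented.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "store_visits": [120, 150, 100],
    "store_sales": [30, 45, 20],
    "phone_calls": [80, 60, 90],
    "phone_sales": [8, 9, 18],
})

# Dividing one column by another works row by row, like filling a formula down in a spreadsheet.
df["store_conversion"] = df["store_sales"] / df["store_visits"]
df["phone_conversion"] = df["phone_sales"] / df["phone_calls"]

print(df[["store_conversion", "phone_conversion"]])
```

There is no loop and no dragging a formula down a column: the division is applied to every row at once.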

Pretty simple! Now we can convert the date to the standard YYYY-MM-DD format and find each date’s day of the week (Monday, Tuesday…).
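Here is one way the date cleanup could look. It assumes the raw dates are month/day/year strings, which is a guess about the file, and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; assumes dates arrive as month/day/year text.
df = pd.DataFrame({"date": ["1/4/2021", "1/5/2021", "1/6/2021"]})

# Parse the text into real datetime values so pandas can do date math on them.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Name of the weekday for each date.
df["day_of_week"] = df["date"].dt.day_name()

# The same dates written out as YYYY-MM-DD text.
df["date_iso"] = df["date"].dt.strftime("%Y-%m-%d")

print(df)
```

Once the column holds real datetime values, pandas displays them in YYYY-MM-DD order by default, and the `.dt` accessor exposes parts of the date such as the weekday.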

Finally, put the new day-of-the-week column into a pivot table to see which days have the highest average conversion % in store and over the phone. The last cell will save the pivot table as a CSV.
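A pivot table of average conversion by weekday might be built like this. The column names and numbers are invented for the sketch, and the notebook may build its pivot differently.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "day_of_week": ["Monday", "Tuesday", "Monday", "Tuesday"],
    "store_conversion": [0.25, 0.30, 0.35, 0.20],
    "phone_conversion": [0.10, 0.15, 0.20, 0.05],
})

# One row per weekday, one column per metric, each cell the average for that day.
pivot = pd.pivot_table(
    df,
    index="day_of_week",
    values=["store_conversion", "phone_conversion"],
    aggfunc="mean",
)
print(pivot)

# With no path, to_csv returns the CSV as text; pass a file name
# (for example "pivot.csv", a hypothetical name) to write it to disk.
csv_text = pivot.to_csv()
print(csv_text)
```

Rerunning the same cells on tomorrow’s file produces the same table without any manual steps.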

This might seem like a lot of work to generate a pivot table. But what if this CSV was sent every day? Or you had to do this for hundreds of stores?

This is just a small part of what you can do with pandas. If you want to learn more, there are great YouTube videos and some great books to get started with pandas. Hopefully you can see the power of the tool and of a simple ecosystem to start testing!

If you have any comments or questions let me know in the comments!

Startup Weekend Guide

This guide is to help people who have never attended Startup Weekend or who want to get the most out of their weekend. Startup Weekend is a great opportunity to start your idea or join a startup while meeting local entrepreneurs, designers, and developers.

Before the weekend
Make sure to have business cards printed before the event. If you are reading this on Thursday night or only have a few hours, you can order business cards online from Staples and choose same-day pickup. It’s cliché to say you need business cards, but there will be between 30 and 100 people at this event. You most likely won’t be able to really interact with all of them, but you will want an easy way for them to connect with you after the event.

If you have an idea:
Write 300 words describing your idea
List your skills (business, marketing, design, programming)
List your needed skills
Create a name. Don’t spend too long on this, but people will ask what your idea is called. A lot of people just wing it, but it will look better if you have a name.
Cut the 300 words down to 100 words
If you are serious about pitching, practice a few times in front of a mirror or video camera, and focus on telling a story about why you have your idea and why it will be a great startup.

If you don’t have an idea:
Look on Twitter / Facebook / Meetup to reach out to others going to the event.
If you are a developer or designer, look around for assets that would be useful. Maybe a boilerplate that you would like to try or some free design assets on Dribbble.
Write down some goals for the event. If you are looking for a new job you could write “speak with 3 startups” or “make 10 meaningful connections”.

Friday:

Find out when your Startup Weekend starts and try to get there early. Most events offer food and water before the event starts, and this is a great time to network with other entrepreneurs.
While you are networking, make sure to talk with a few people with programmer, business, and designer name tags. Finding people who have technical and design skills is especially important over the weekend. If you see that there are a ton of programmers and only a handful of designers, or vice versa, make sure to speak with the scarcer group. You might not end up working with them, but you might ask them for help over the course of the weekend.

If you are pitching, try to be one of the first people to pitch. People are more likely to remember you, be engaged, and write your idea down. After you have pitched, sit down, listen, and take notes on other people’s ideas in case your idea doesn’t get support. You can always build your idea in the future with people you meet at this event, but take this opportunity to build a successful idea with others. Once everyone has pitched, the organizers will give everyone who pitched a large sheet of paper with their idea’s name on it and ask them to put it on the wall.

This is where you need to be active in recruiting people for your team, and this is where the networking before the event really becomes important. People will walk around and ask you about your idea. Try to make the problem or idea relatable to them. Keep a lookout for people you need on your team. For example, if you have business skills but need a graphic designer and someone with programming experience, watch for people with these skills. However, just because someone doesn’t have a skill set that you need doesn’t mean you shouldn’t engage and recruit them. Skills like sales, marketing, writing, and speaking are extremely helpful. As people walk around they will vote for their favorite ideas, and they might vote for yours even if they aren’t joining your group.

The top 7-10 ideas will be chosen by the number of votes they get, and these are the ideas people will form teams around. If your idea is voted into those top positions, use your earlier networking to actively recruit the people you want, or enlist people who are passionate about your idea to help find the people you need. Don’t wait for people to come to you. If your idea did not receive enough votes, you can still work on it if you have at least one other member. You can also cap the number of people to keep the team small. Just find a group (1-4 people) that you think can complete the job and is motivated. A group larger than 5 can become overwhelming because there might not be enough work, and some of the group might get distracted.

If you are not pitching, listen to every idea and write down a description of each person and their idea. This way, when you want to talk to someone later about their idea, you can find them and introduce yourself. It’s easier to remember that someone is wearing a blue polo shirt and has an idea about changing how teams work together than to remember the name they gave it. You will have a chance to vote on your favorite ideas with Post-it notes. Take time to walk around, ask more about specific ideas, and talk to people who need the skills you possess. Vote for the ideas you think are most likely to succeed, and find an idea that you are interested in and can help with. Skills in sales or relevant experience can be just as important as graphic design or programming, depending on the idea.

After teams are formed, the organizers let you go to separate meeting spaces to get started on the idea with your team. This is a great time to build a mind map of what you want to accomplish over the weekend. You should spend an hour mapping out what this idea could be. Once you have created this map, try to find the few crucial pieces of the idea that will add the most value. It is important to build the things that add the most potential value first. When you pitch to the judges you can showcase other features that you want to create in the future. Depending on how hard the idea is, you can start designing and programming on Friday night, or you can end around midnight and get ready for Saturday.

Saturday:

Most Startup Weekend events start early, around 8-9am. Again, if the idea will take a lot of time to design and develop, it might be good to get there early. On your way to the morning event it might be a good idea to stop at the store and grab some Red Bull and snacks. These are great to have around when you are tired, or they can be used to ask for a favor. Saturday is the only full day of the weekend spent working. You should use it by going out and talking to potential clients outside the ecosystem of Startup Weekend. It’s important to get customer validation. This will look really good to the judges in the final pitch, but it will also help focus your minimum viable product (MVP) so you create a product that customers will actually use.

Try to spend 2-3 hours getting feedback from potential customers. (This article doesn’t cover validation but here are a few articles that might help.) Once you have finalized your most important features, or the parts of your idea that add the most value, it’s time to get to work. If you have people on your team who can do all the parts of the project that you need, you are in great shape and should be able to get a lot of work done during the weekend. If you are missing key skills to complete the job, don’t worry. It is better to have a validated idea that solves a problem than to have a fully functional application.

In my opinion there are two ways to win:
1) Design a product/idea that is validated and create an amazing presentation.
– or –
2) Build an entire product/idea that is validated with a full demo presentation.

At 5pm on Saturday you should stop making changes to the idea or product. That means no more features or “wouldn’t it be cool…” statements. You should focus on finishing what you have decided on up to that point. It’s very easy to spend hours brainstorming the perfect solution to the problem, but you have a time constraint. If you are able to get the product or idea made, then you can go back and add more features.

Tools:
InVision – Rapid prototyping website that lets you upload images and make them interactive. This is perfect if you don’t have a coder or you want to prototype before building something.
Meteor.js – JavaScript framework that allows for rapid prototyping. I learned how to code and built an iOS and Android app with it. If you have an experienced or intermediate developer on your team, they should be able to pick it up.
Keynote – I recommend using Keynote for your presentation. It has a bunch of animation features that are easy to use and will really impress the judges. Even if you have never used Keynote before, you will be surprised how quickly you can make a professional-looking presentation.

Sunday:
Most of the product/idea should be built or clearly described. If you can’t build the technical aspect, you should focus on visualizing the end product. The judges understand that you might not have the skills or the time to build a fully functioning product. You should spend most of Sunday perfecting your pitch. Hopefully you have a designer who can make slides and product mockups. If you have a larger group, you will be able to continue working on the product while working on the pitch. If you have a smaller group, you should focus on getting the pitch made rather than building the product. Finally, I recommend having only one speaker, because transitions between multiple speakers usually don’t go smoothly.

I find it most useful to build the slides, write the entire pitch, then put it into bullet points. Your process might include making bullet points before writing the entire pitch, but I recommend bullet points as the final result because it is easier to remember 3-5 bullet points for a slide than 2 paragraphs. I don’t recommend trying to remember the pitch verbatim and HIGHLY suggest not reading from a piece of paper.

During the pitch remember to breathe and speak slowly.

After the weekend:
Whether your startup won or not, you should continue with the idea. From my experience, most group members will be really excited to continue working on it, but that interest will disappear after 2 weeks. That shouldn’t stop you from continuing to work on the startup.

If you have any suggestions or advice leave a comment below!