Log of what Solea Miller has learned at Techie Youth

Fri. Aug. 12, 2022

Techie Youth: Random Forest & Machine Learning Competitions Practice

Today is my last day working with Techie Youth, and I learned about random forest models and practiced for machine learning competitions. A random forest uses multiple decision trees and averages the predictions that the trees make. It works well with default parameters and generally has better predictive accuracy than a single decision tree. In code, it is imported from sklearn.ensemble and is used in the same way as DecisionTreeRegressor.

RandomForestRegressor(random_state = 1)
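
To see the averaging idea in action, here is a minimal sketch. The tiny beds/baths dataset and the variable names are made up for illustration (they are not from the course exercises), and it assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data: [beds, baths] -> price
X = np.array([[1, 1], [2, 1], [3, 2], [4, 2], [5, 3], [6, 3]])
y = np.array([120000, 150000, 190000, 220000, 260000, 290000])

# Both models are defined and fitted the same way
tree_model = DecisionTreeRegressor(random_state=1)
forest_model = RandomForestRegressor(random_state=1)
tree_model.fit(X, y)
forest_model.fit(X, y)

# Predict the price of a home the models have not seen
new_home = np.array([[3, 1]])
tree_pred = tree_model.predict(new_home)
forest_pred = forest_model.predict(new_home)

# The forest's prediction is the average of its individual trees' predictions
per_tree = [t.predict(new_home)[0] for t in forest_model.estimators_]
average_of_trees = sum(per_tree) / len(per_tree)

print(tree_pred[0], forest_pred[0], average_of_trees)
```

The last two printed numbers should match, which is the "average of many trees" idea described above.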

Next, I did the practice for machine learning competitions. The practice put all of the components I learned in the previous units together and added random forest. It started with rerunning the code I wrote in previous exercises, then training the random forest model on the training data, then reading the test data from its file path, and finally making a submission.

Although my SYEP work is done, I still plan on finishing the Techie Youth coursework. The next step is the Kaggle course on Intermediate Machine Learning and Pandas, but I won't be moving on without reviewing everything I learned and organizing the notes I took. I had a lot of fun learning about data science and ML here, and I am going to continue doing so on my own.

Time: 5hrs

Thu. Aug. 11, 2022

Techie Youth: Model Validation, Underfitting & Overfitting

Yesterday I learned about model validation, and today I learned about underfitting and overfitting.

Tue. Aug. 9, 2022

Techie Youth: Data Exploration & Machine Learning Models

Yesterday, I learned about models and data exploration, and today I learned how to use machine learning models.

Machine learning uses patterns from past scenarios to make predictions about new scenarios. One popular model is the Decision Tree, which is a model of decisions and their possible outcomes. Capturing patterns from the data is called fitting or training the model. The fitted model is then applied to new data to predict outcomes. The points at the bottom of the tree, where the predictions are made, are called leaves.

In data exploration, the pandas library is used to get to know the data. The most important part of pandas is the DataFrame, which holds the kind of data you think of as a table. You start by saving the file path to a variable, reading the data and storing it in a DataFrame (with pd.read_csv), then printing a summary of the data (.describe()). From there you can see a table that includes the count, mean, standard deviation, min, and max, as well as the 25th, 50th, and 75th percentiles.

When preparing data for an ML model in pandas, you can:

.columns - to see a list of all columns in a dataset

.dropna - to drop rows with missing values

pull out a variable with the dot-notation

select features using a list of column names in string form
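
Here is a small sketch of those steps together. The CSV text, column names, and values are made up so the example runs without a file (a real exercise would pass a file path to pd.read_csv), and it assumes pandas is installed.

```python
import io
import pandas as pd

# Hypothetical data standing in for a CSV file on disk
csv_text = """Rooms,Bathroom,Price
2,1,300000
3,2,450000
4,,600000
3,1,410000
"""

# Read the data into a DataFrame and print a summary
home_data = pd.read_csv(io.StringIO(csv_text))
summary = home_data.describe()
print(summary)

# See a list of all columns
print(home_data.columns)

# Drop rows with missing values (the third row has no Bathroom value)
home_data = home_data.dropna(axis=0)

# Pull out the prediction target with dot-notation
y = home_data.Price

# Select features using a list of column names in string form
feature_names = ["Rooms", "Bathroom"]
X = home_data[feature_names]
print(X)
```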

Models are created using scikit-learn (sklearn). The model is defined, and a number is specified for random_state so you get the same results on every run. The model is then fitted using .fit().

Time:

Yesterday - 5hrs

Today - 5hrs

Thu. Aug. 4, 2022

Techie Youth: Loops, Strings & Dictionaries

Yesterday I learned about loops, while today I learned about strings and dictionaries.

Loops are a way to repeatedly execute code. 'for' loops specify the variable name to use and the set of values to loop over. 'while' loops keep iterating as long as their condition is true and stop once it becomes false. The condition is evaluated as a boolean.
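
A short sketch of both kinds of loop. The list and variable names are just examples I picked, not from the course.

```python
planets = ["Mercury", "Venus", "Earth"]

# for loop: 'planet' is the variable name, 'planets' is the set of values
upper_names = []
for planet in planets:
    upper_names.append(planet.upper())

# while loop: keeps running as long as the condition i < 5 is True
i = 0
squares = []
while i < 5:
    squares.append(i ** 2)
    i += 1

print(upper_names)
print(squares)
```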

Strings can be written with double or single quotes, but Python will get confused by a single quote character inside a single-quoted string. This can be avoided by putting a backslash before it (or by using double quotes around the string). Python can create a new line using \n, or using triple quotes and the enter key. 'print()' adds a new line automatically unless you specify a value for 'end' other than \n.

For example:

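print("hello", end=' ')

print("world", end=' ')

would print

hello world

(Both words end up on one line, each followed by a space. With end='' the output would be helloworld.)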
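
To double-check the string rules, here is a small sketch. The example strings are my own, and it sends print's output to an io.StringIO buffer only so the result can be stored in a variable and inspected.

```python
import io

# A backslash lets a single quote live inside a single-quoted string
single = 'Pluto\'s a planet!'
double = "Pluto's a planet!"

# \n and triple quotes both create a new line
with_escape = "hello\nworld"
with_triple = """hello
world"""

# end='' joins the two prints with nothing in between
buffer = io.StringIO()
print("hello", end='', file=buffer)
print("world", end='', file=buffer)
no_space = buffer.getvalue()

# end=' ' puts a space after each printed value instead of a new line
buffer = io.StringIO()
print("hello", end=' ', file=buffer)
print("world", end=' ', file=buffer)
with_space = buffer.getvalue()

print(repr(no_space))
print(repr(with_space))
```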

Time:

Yesterday - 5hrs

Today - 5hrs

Tue. Aug. 2, 2022

Techie Youth: Conditions & Conditional Statements (Kaggle)

Today I worked on:

Conditions & Conditional Statements -

Conditions are statements that are either true or false, and the most common ones compare different values. Variables can be compared. Some common symbols used to create conditions are:

== equals

!= does not equal

< less than

<= less than or equal to

> greater than

>= greater than or equal to

A single = assigns a value, while a == checks if two things are equal.

Conditional statements change how a program runs depending on a condition. They include if, elif, and else. If the condition after 'if' is true, the function performs the action under it. If the condition is false and there is an 'else' statement, the function performs the action under the else statement. If the first condition is false, you can also use 'elif' to check a second condition before falling back to 'else'.
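
A minimal sketch of if, elif, and else working together. The function name and the temperature cutoffs are made up for illustration.

```python
def describe_temperature(temp_f):
    # Only the first branch whose condition is True runs
    if temp_f >= 90:
        message = "hot"
    elif temp_f >= 60:
        message = "mild"
    else:
        message = "cold"
    return message

# A single = assigns a value, while == checks whether two things are equal
x = 5
is_five = (x == 5)

print(describe_temperature(95))
print(describe_temperature(72))
print(describe_temperature(40))
print(is_five)
```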

Lists - A review on lists and subsetting them

This completed the Kaggle - Intro to Programming course. After that, I went through their intro to Python course and reviewed: syntax, functions including help( ), booleans & conditionals, and lists.

Time: 5hrs

Mon. Aug. 1, 2022

Techie Youth: Functions & Data Types (Kaggle)

Today I reviewed data types and learned more about functions and how to write them.

Functions have 6 parts: the header, name, argument, parentheses, body, and return statement.

1. header - defines the name of the function and its argument(s)

[def]

2. name - name of the function

get_expected_cost

3. argument - name of function input variable

(beds, baths):

4. parentheses - the parentheses enclosing the function argument(s) must be followed by a colon

( ):

5. body - specifies the work that the function does (every line must be indented 4 spaces [tab])

value = 80000 + (30000 * beds) + (10000 * baths)

6. return statement - final line of code, returns the output value

return value

EX:

def get_expected_cost(beds, baths):
    value = 80000 + (30000 * beds) + (10000 * baths)
    return value
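
To show how the function is used once it is defined, here is a short sketch that calls it. The input numbers are just examples I chose.

```python
def get_expected_cost(beds, baths):
    # Base cost plus a fixed amount per bedroom and per bathroom
    value = 80000 + (30000 * beds) + (10000 * baths)
    return value

# Arguments can be passed by position or by name
cost = get_expected_cost(3, 2)
cost_by_name = get_expected_cost(beds=2, baths=1)

print(cost)
print(cost_by_name)
```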

I also reviewed data types. The types are: strings, floats, integers, and booleans.

Fri. Jul. 29, 2022

Techie Youth: Titanic Dataset (Kaggle)

Today I worked on Kaggle's Titanic Dataset. After opening up a notebook in Kaggle, the coding environment allows you to create boxes (cells) to type your code in. The output appears below the box in black text. I imported the train data and test data and ran the code to see if each would display. After that, I moved on to finding the percentages of men and women who survived. At first, it produced an error when I ran the code, but then I found out that it worked when I ran all of the code instead of just the portion I had typed in. The tutorial ended with the creation of a random forest model, and I went back to completing Kaggle's Intro to Python course.

Time: 5hrs

Thu. Jul. 28, 2022

Techie Youth: Exploring Kaggle

Today I finished the iris dataset tutorial and started exploring Kaggle. After finishing the tutorial, I understood some parts of it, like creating and subsetting the arrays of the dataset and using functions like .shape. However, I felt that my understanding of what I was really doing was very surface level. This continued when I went onto Kaggle and joined a competition. I was unsure of what exactly I was supposed to do. This came to an end when I looked around some more and saw Kaggle's Python course. It first reintroduced some basic Python concepts like printing and calculations, then it gave a tutorial on how to do Kaggle competitions and the skills needed to compete in them. It used its Titanic competition (using the Titanic passenger data to try to predict who will survive and who will die) to explain how to use the site and its coding environment. I plan on continuing it tomorrow and over the weekend to see how much I can learn from it.

Wed. Jul. 27, 2022

Techie Youth: Iris Dataset

Today I figured out how Anaconda works with Python. I spent a long time trying to figure out how to connect Anaconda to the Python software itself, and didn't realize that all I had to do was open Anaconda Prompt (which I didn't realize was already installed on my computer). From that point on it was smooth sailing. I learned how to create different environments using 'conda create --name environment_name' and add packages to them. I created an environment named iris and added scikit-learn, numpy, scipy, pandas, and matplotlib. I then followed the tutorial and loaded the sklearn, pandas, and matplotlib libraries it asked for. Next, I loaded the dataset, printed its shape (150, 5), and printed 20 rows using .head. After that, I ran code that printed out the lengths and widths of different parts of the irises and grouped them by class. The code then created a box and whisker plot, a histogram, and a scatterplot, and displayed them using pyplot.show.

Time: 5hrs

Tue. Jul. 26, 2022

Techie Youth: Working With Anaconda

Yesterday, I figured out how to download Python and Anaconda and read up on how to import packages and the different machine learning projects I can start. One of them is the iris flowers dataset, which is the 'hello world' of machine learning and statistics.

Today I spent time trying to figure out how to get Anaconda to work with Python. I searched for tutorial videos and articles on the Anaconda website, Python site, and on other miscellaneous websites, but I'm still unsure. Since I was unsuccessful in that, I decided to take a break from it and explore Kaggle. I made an account yesterday, and today I started looking for a competition to enter later. One that piqued my interest was the Mayo Clinic's STRIP AI competition. The goal was to create an image classification AI that identifies the origin of blood clots in strokes. Tomorrow I hope I can figure out how Anaconda works with Python, and start working on the iris flowers dataset project.

Time:

Yesterday - 5hrs

Today - 5hrs

Fri. Jul. 22, 2022

Techie Youth: Final Datacamp Exercise

Today I finally finished Datacamp's intro to Python course. I struggled with the final exercise of the course. I had no problem creating NumPy arrays or finding the median, but I got stuck when it asked me to use np_positions == 'GK' as an index for np_heights. I reviewed all the videos and exercises in the unit to see if I missed anything, and went searching on a bunch of websites on the same topic of NumPy arrays and subsetting them, but it didn't really help me. I tried plugging in different variations of what I thought the answer was, and then eventually I figured out that I was formatting it wrong. I was only putting 'GK' in the brackets instead of the entire thing, so I had the right idea but the wrong execution. It was such a relief to enter the code and have it finally run properly. I can't wait to start learning about machine learning projects next week.
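
Looking back, the indexing step that tripped me up can be sketched like this. The positions and heights below are a small made-up sample, not the real exercise data, and it assumes NumPy is installed.

```python
import numpy as np

# Hypothetical sample data (the real exercise used much longer lists)
positions = ["GK", "M", "A", "D", "GK", "M"]
heights = [191, 184, 185, 180, 188, 179]

np_positions = np.array(positions)
np_heights = np.array(heights)

# The whole comparison goes inside the brackets, not just 'GK'
is_gk = np_positions == "GK"
gk_heights = np_heights[is_gk]
other_heights = np_heights[np_positions != "GK"]

print(np.median(gk_heights))
print(np.median(other_heights))
```

The comparison np_positions == "GK" produces an array of True/False values, and using that array as the index keeps only the heights where it is True.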

Time: 5hrs

Thu. Jul. 21, 2022

Techie Youth: Basic Statistics

What I learned today at Techie Youth was basic statistics with NumPy. You can use different functions in NumPy to find out more information on your data. Using the np.mean( ) and np.median( ) functions you can find the mean and median of an array. You can also find the standard deviation and the correlation between two data sets with the functions np.std( ) and np.corrcoef( ). I got more practice with creating and subsetting 2D arrays, but got stuck on the final practice problem. Tomorrow I hope to finish the final activity, and start working on some projects in Python.
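
A minimal sketch of these four functions. The height and weight values are made up for illustration, and it assumes NumPy is installed.

```python
import numpy as np

# Hypothetical measurements for four players
height_cm = np.array([180.0, 215.0, 210.0, 188.0])
weight_kg = np.array([78.4, 102.7, 98.5, 75.2])

mean_height = np.mean(height_cm)      # average value
median_height = np.median(height_cm)  # middle value
std_height = np.std(height_cm)        # standard deviation

# corrcoef returns a 2x2 matrix; the off-diagonal entries are the
# correlation between the two data sets
corr = np.corrcoef(height_cm, weight_kg)

print(mean_height, median_height, std_height)
print(corr)
```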

Time: 5hrs

Wed. Jul. 20, 2022

Techie Youth: More Work With NumPy Arrays

Today I learned about 2D arrays and how to subset them. Subsetting arrays is just like subsetting lists. You use the same square brackets and use the index values starting from 0 to refer to the list element you need. 2D arrays can be subsetted in the same way. They are created from lists of lists, and with the .shape attribute, you can find out how many rows and columns exist in an array.

EX:

# Print out the shape of np_baseball
print(np_baseball.shape)

output: (4, 2)

4 rows, 2 columns

np_baseball =

[[180.   78.4]
 [215.  102.7]
 [210.   98.5]
 [188.   75.2]]

Subsetting:

To select a row, you put its index number (0 - 3) inside brackets. To select a column within that row, you add the column's index number (0 - 1). For example, to select the array element 98.5, you would type:

np_baseball[2][1] or np_baseball[2, 1]

To select an entire row or column, you would use a colon (:) in place of the other index, as in np_baseball[:, 1] for the whole second column.
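
Putting the 2D subsetting rules together, here is a short sketch that rebuilds the same 4 x 2 array from a list of lists. It assumes NumPy is installed, and the extra variable names are my own.

```python
import numpy as np

# A 2D array is created from a list of lists
np_baseball = np.array([[180, 78.4],
                        [215, 102.7],
                        [210, 98.5],
                        [188, 75.2]])

print(np_baseball.shape)           # rows, columns

third_row = np_baseball[2]         # one whole row
element = np_baseball[2, 1]        # row 2, column 1
same_element = np_baseball[2][1]   # same element, two-step form
second_column = np_baseball[:, 1]  # every row, column 1
first_row = np_baseball[0, :]      # row 0, every column

print(element)
print(second_column)
```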

Time: 5hrs

Tue. Jul. 19, 2022

Techie Youth: Using NumPy Arrays & Other Functions

Today I did more work with NumPy arrays on Datacamp. To import NumPy and make it easier to refer to, you would use 'import' and use 'as' to refer to it by a different name.

EX: import numpy as np

After importing it as np, you can now use it in this format:

EX: np.array( )

You can then perform calculations with the array

EX:

# height_in is available as a regular list

# Import numpy
import numpy as np

# Create a numpy array from height_in: np_height_in
np_height_in = np.array(height_in)

# Print out np_height_in
print(np_height_in)

# Convert np_height_in to m: np_height_m
np_height_m = np_height_in * 0.0254

# Print np_height_m
print(np_height_m)

The output would be the array of heights in inches and the array of heights in meters:

[74 74 72 ... 75 75 73]

[1.8796 1.8796 1.8288 ... 1.905 1.905 1.8542]

The arrays can be subsetted just like regular lists, and also with boolean arrays. In this case, the variable 'light' marks which baseball players have a BMI under 21, and bmi[light] holds their BMI values:

# Calculate the BMI: bmi
np_height_m = np.array(height_in) * 0.0254
np_weight_kg = np.array(weight_lb) * 0.453592
bmi = np_weight_kg / np_height_m ** 2

# Create the light array
light = bmi < 21

# Print out light
print(light)

# Print out BMIs of all baseball players whose BMI is below 21
print(bmi[light])

output:

[False False False ... False False False]

[20.54255679 20.54255679 20.69282047 20.69282047 20.34343189 20.34343189

20.69282047 20.15883472 19.4984471 20.69282047 20.9205219 ]

Printing 'light' by itself returns a boolean array that shows True or False for each player based on the condition bmi < 21. Using it as an index into the bmi array prints every BMI value under 21.

Time: 5hrs

Mon. Jul. 18, 2022

Techie Youth: Working With Packages

Today is Monday, the start of week three at Techie Youth. I learned about what packages are and their many uses. Packages are directories of Python scripts that exist to expedite the coding process. Each script is a module, and modules specify functions, methods, and types. There are thousands of packages available for use. Some examples are NumPy (numerical programming), Matplotlib (visualizations), and scikit-learn (machine learning). An example of their usefulness is the NumPy array. Since Python cannot do element-wise calculations on regular lists, the NumPy array can be used to do that. You can also use a comparison like > on an array to get back an array of booleans showing whether each value meets the condition. However, an array can only contain one type. I hope to learn more about packages tomorrow.
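
Here is a small sketch of the difference between lists and NumPy arrays. The heights and weights are made-up values, and it assumes NumPy is installed.

```python
import numpy as np

heights = [1.73, 1.68, 1.71]
weights = [65.4, 59.2, 63.6]

# Regular lists cannot be used in element-wise calculations
list_error = None
try:
    bmi = weights / heights ** 2
except TypeError as err:
    list_error = str(err)

# NumPy arrays do the calculation element by element
np_heights = np.array(heights)
np_weights = np.array(weights)
bmi = np_weights / np_heights ** 2

# A comparison on an array returns an array of booleans
over_21 = bmi > 21

# An array holds only one type, so mixed values are converted to a common one
mixed = np.array([1.0, "is", True])

print(list_error)
print(bmi)
print(over_21)
print(mixed)
```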

Time: 5hrs

Fri. Jul. 15, 2022

Techie Youth: Functions, Help & Methods

Today is the end of week 2 at Techie Youth, and today I worked with functions, help, and methods. I decided to test out working from an iPad today, and both the Datacamp and Techie Youth website functioned correctly. The only thing that worked differently is that some of my hours did not log. Other than that, today I started off the day learning about functions. Python has built in functions to make life easier. Some familiar functions are:

print(): Used to print a message to the screen

type(): Used to find the type of a value

str, int, bool, float: switch between data types

The new functions I learned about were:

len: finds the length of a string or list

help: gives information about other functions

max: returns the largest value item

pow: takes three arguments: base, exp, and mod. base and exp are required arguments, mod is an optional argument.

sorted: sorts a list in descending or ascending order

upper: returns a string where all characters are uppercase

I also learned about methods today. Methods are functions for specific objects based on their type. When I was doing the coding exercises on Datacamp I got stuck because I was using the wrong syntax for them.

What I did wrong: upper(variable)

The correct answer: variable.upper( )

I tried to put the variable in the parentheses instead of before the period, which resulted in an error.
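
A quick sketch of the functions listed above, plus the method mistake I made. The values and variable names are examples I picked.

```python
place = "poolhouse"
areas = [11.25, 18.0, 20.0, 10.75, 9.50]

length = len(place)                       # number of characters
largest = max(areas)                      # largest item
power = pow(2, 5)                         # base and exp only
power_mod = pow(2, 5, 3)                  # optional mod: (2 ** 5) % 3
ascending = sorted(areas)                 # ascending is the default
descending = sorted(areas, reverse=True)  # reverse=True gives descending

# upper is a method, so it is called on the string with a period
shouted = place.upper()

# Calling it like a function fails because no function named upper exists
wrong_syntax_failed = False
try:
    upper(place)
except NameError:
    wrong_syntax_failed = True

print(length, largest, power, power_mod)
print(ascending)
print(descending)
print(shouted, wrong_syntax_failed)
```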

I hope to learn more about Python next week.

Time: 5hrs

Thu. Jul. 14, 2022

Techie Youth: Working With Lists in Python

Today is the second to last day of week 2 at Techie Youth, and today I focused on lists. Picking up from where I left off, I continued working with subsets of lists in Python. It confused me at first, because the index values for the list items start from 0 instead of 1, but I got it after working with them for a little while. The problems moved on to calculations with lists and then slicing and dicing. To slice a list is to select multiple elements from that list. To do that, you would use the same subset command except put the start number and end number between the brackets, separated by a colon.

Ex: areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0, "bedroom", 10.75, "bathroom", 9.50]

areas[0:6]

If I were to print this, the output would be a list that goes from 'hallway' to the living room area, 20.0. The end index is not included so you have to go one value higher.

You can subset lists of lists by adding more brackets, replace list elements by selecting an index and assigning it a different value, extend a list by using the + operator to add a list to a list, and delete list elements by using the del statement.
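
The list operations from today can be sketched as follows. The extra values (the garage and the new bathroom area) are made up for illustration.

```python
areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0,
         "bedroom", 10.75, "bathroom", 9.50]

# Slicing: the end index is not included
downstairs = areas[0:6]
upstairs = areas[6:]

# Subsetting a list of lists uses more brackets
house = [["hallway", 11.25], ["kitchen", 18.0]]
kitchen_area = house[1][1]

# Replace an element by assigning to its index
areas[-1] = 10.50

# Extend a list by adding another list with +
areas_extended = areas + ["garage", 15.45]

# Delete elements with the del statement
del areas_extended[0:2]

print(downstairs)
print(upstairs)
print(areas_extended)
```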

Time: 5hrs

Wed. Jul. 13, 2022

Python: Variables, Types, Lists & Subsets

Today, I continued Datacamp's course on introductory Python. I reviewed variables and types, and learned about lists and subsets. Variables are names that store values. You can calculate using variables, as long as you define the variable before using it. Some of the most common variable types are:

int: A type that represents integers, or numbers without a fractional part.

float: A type that represents numbers with a fractional part.

str: A type to represent text. Can be used with single or double quotes.

bool: Named after George Boole, booleans are a data type that represents logical values and can only be True or False.

If you want to combine different variable types in a calculation or print them together, you have to convert them to a common type first. ex: str()
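
A short sketch of why the conversion is needed. The variable names and values are my own examples.

```python
savings = 100

# Adding a string and an int directly raises a TypeError,
# so the int is converted with str() first
try:
    message = "I started with $" + savings
except TypeError:
    message = "I started with $" + str(savings)

# The other conversion functions work the same way
pi_string = "3.1415926"
pi_float = float(pi_string)  # str -> float
whole = int(pi_float)        # float -> int (drops the fractional part)
flag = bool(1)               # int -> bool

print(message)
print(pi_float, whole, flag)
```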

In lists, you don't have to convert variable types. Lists can hold a set of variables no matter the type. Lists can also hold lists. An example of a list:

hall = 11.25

kit = 18.0

liv = 20.0

bed = 10.75

bath = 9.50

areas = ["hallway", hall, "kitchen", kit, "living room", liv, "bedroom", bed, "bathroom", bath]

To select specific components of a list, you can subset it by naming the list and typing the number of the variable in the list (starts from 0) inside square brackets.

Ex: print(areas[1])

This would print the value of the variable 'hall' which is 11.25.

I can't wait to learn more tomorrow.

Time: 5hrs

Tue. Jul. 12, 2022

Algorithms & Python Basics

Today, I learned about algorithms and started to review Python basics. An algorithm is a set of steps designed to accomplish a task, and a good one does so efficiently. To understand how algorithms work, you have to start thinking algorithmically. Algorithmic thinking is a more systematic way of thinking through problems that is similar to how a computer works. You have to clearly define the problem, break it down into smaller parts, define the solution for each of those parts, implement the solution, and then make it efficient. Some simple examples of algorithms in daily life are sorting a list of numbers and baking a cake. Two common types of algorithms are searching algorithms and sorting algorithms.

I also started the unit on Python today. I have worked with Python in the past, and today’s unit was a good review of what I learned previously. It went over the print function and using Python for calculations. I can’t wait to learn more.

Time: 5hrs

Mon. Jul. 11, 2022

Introductory Completion & Starting the AI Unit

Today is the start of my second week at Techie Youth, and I'm proud to say that I finally completed both of the introductory units. I learned about habits, mental health, mentorships and role models. Habits take 21 days to develop and 90 days to become a lifestyle change. Developing positive habits and starting new activities can help your physical and mental health. You can volunteer and give to others, cut sugary foods out of your diet, exercise regularly, meditate, do yoga, spend time with others, learn new things, and sleep.

The ideal role model is authentic, has flaws that they aren't ashamed of, is inspirational, and has positive ethics that you agree with. They are someone you look up to and want to emulate. However, you may not have a relationship with your role model. Mentorships are more personal. You have a mentor who guides you through whatever subject you are interested in. In a good mentorship, you show what you can offer and what you know about the topic, do your own research, and the mentor guides you and answers any specific questions you may ask. It is important to respect their time and be open minded.

I also started the AI unit today. AI is the practice of computer recognition, reasoning, and action. I learned about the 6 major branches of AI: machine learning, neural networks, robotics, expert systems, fuzzy logic, and natural language processing.

I am excited to learn more about AI tomorrow.

Time: 5hrs

Fri. Jul. 8, 2022

Techie Youth: Scams, Networking

Today is day 4 of working with Techie Youth, and today I finished the first introductory unit. I learned about scams, networking, and productivity. Scams come when you're vulnerable or desperately need something. They flaunt connections with huge businesses and organizations, when in reality they haven't worked there a day in their lives. Common red flags of scams are instantly asking for money, unfair work contracts, unreasonably high salaries or opportunities, and jobs with no contract or qualifications.

Networking can be nerve wracking, but it's very beneficial in the long run. Your goal is to create meaningful connections with others. You should start with small talk, ask about what the other person does, then ask for their business card. You should be offering up your services and seeing what you can do to help the other person. The best places to network are at work, school, cultural events, professional meetings, job fairs, and volunteer opportunities. You should network all the time and with a variety of people.

Time: 7hrs

Thu. Jul. 7, 2022

Techie Youth: Remote Work & Interviews

Today is day 3 of working with Techie Youth, and here is what I learned today:

(My previous answer was longer, but did not save after I submitted it.)

Time: 6hrs

Wed. Jul. 6, 2022

Techie Youth: LinkedIn, Resumes, and Cover Letters

Today's day 2 of working with Techie Youth, and I feel like I accomplished a lot today. I learned how to make a LinkedIn account, create a cover letter, and create a resume. The ideal LinkedIn page starts off with a clear picture of you with good lighting, facing the camera, and a simple background. It must have your job title as the headline, with keywords and a line that grabs the attention of whoever is viewing your page. Your contacts go next to the headline, and then you move onto your profile summary. It should include what problems you can help solve. Next is the work experience and recommendations. Since I haven't had a job before, I put my volunteer work at an art gallery. The process of making my LinkedIn was very time-consuming because I had to go search for information I didn't have on hand; an example being the date I started high school. The time I spent on it got marked as inactivity as a result.

The next topic I focused on today was making a cover letter. I already had a resume, so I edited it and spent more time on my cover letter. The goal of the cover letter is to be a brief introduction of yourself and your skills to the hiring manager. In mine, I included the Python class I took in high school, as well as my digital and traditional art skills. I want to learn more about Python and other programming languages so I can add them as additional experience with computer science. Overall, I learned a lot about the job application process today, and I hope to learn more tomorrow.

Time: 7hrs

Addition:

The log of what I did yesterday did not save, so here's a brief recap:

I was introduced to Techie Youth, I learned how to write a check and how to choose a checking account, how to endorse a check, and how to choose your workplace (good ethics, negotiate wages).

Time: 5hrs