Your Personal Database (PostgreSQL)

I was reading Choose FI: Your Blueprint to Financial Independence and one of the chapters concluded with a question like:
“What would you do if you didn’t have to work?”

Something rose to the surface. Even if I didn’t need to work to earn money, I would still practice data analysis using SQL.

This awakened my desire to set up a SQL server-database for personal use. Back-end database access where I can write queries. I miss this dearly from my previous job, where I had an in-house electronic record system and superuser access. I’ve tasted the forbidden fruit and cannot go back to measly front-end, web-browser button clicking to configure reports with limited functionality and flexibility. The power of back-end querying is what I seek, but this is challenging when my company doesn’t currently have a database. Setting one up is notoriously hard, even for professional developers.

After some struggles, I set up a personal SQL database so I can practice queries with my own data. I like the IDE called DataGrip by JetBrains (free with a student email address) and PostgreSQL (also free), which is what I used in the previous job. Here’s how to set it up.

Step 1: Download PostgreSQL
It’s free.
https://www.postgresql.org/download/

Step 2: Install PostgreSQL and set up postgres User Password and Port.

The superuser credentials will be used to set up the database connection in the IDE.
Username is postgres (by default). You define the Password.

The default port of 5432 worked for me and should work for most people.

Step 3: Complete PostgreSQL Installation. Restart computer to apply downloaded updates.

Step 4: Download and set up DataGrip.
It’s free with a student email account. There are other free IDEs such as DBeaver too.
https://www.jetbrains.com/help/datagrip/postgresql.html

Step 5: Set up the database in DataGrip.
In the “Database” pane on top left, click the + icon > new Data Source > PostgreSQL.

Give it a name. I called it Personal Postgres.

Use localhost, port 5432, and Authentication type as User & Password.
Enter the User: postgres (all lowercase) and the Password you defined in step 2. Choose your Save password preference (Forever is convenient for a personal computer).

Test the connection. If it works, then hit Apply and OK.

Note: If you get an error message like this, that means PostgreSQL was not installed correctly (step 2).
You MUST use the username and password. The “No Auth” feature did not work for me.

Step 6: Savor the connection!
The first connection can take a few minutes. The database itself runs locally on your computer (localhost), but DataGrip may need to download the PostgreSQL driver and read the database structure before you can use PostgreSQL SQL functions. If you have very strict firewall settings on your computer, you might need to allow connections on port 5432 in Windows Firewall or similar.
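If the connection test fails and you suspect the port, a quick check from Python can tell you whether anything is listening on it. This is only a minimal sketch: the helper name port_is_open is made up for illustration, and it assumes the default localhost and port 5432 from the steps above. It checks that the port accepts a TCP connection, not that the username and password are correct.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumes the default PostgreSQL port chosen during installation
    print("PostgreSQL port reachable:", port_is_open("localhost", 5432))
```

If this prints False, the PostgreSQL service is probably not running or the port is blocked, which is a different problem from a wrong password.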

If everything is good, you’ll get a small Connected status update on the bottom right Event Log:


In a future post, I’ll share how to upload your first database table from a CSV file.

Happy querying!

Compounding Knowledge

It’s been one year since I started studying programming using Codecademy.com. I set out to study 4 to 5 times a week, every week, 1 lesson page at a time. My longest streak on record is 12 weeks in a row. I’ve completed 86% of the Learn Python 3 course (a hefty course that covers programming fundamentals) and finished the Command Line course too (Linux terminal is not so scary anymore!)

I just finished an online project called ‘Fending the Hacker’ where I read and write to CSV and JSON files programmatically with Python. I didn’t realize this till the end, but this project built on prior lesson topics:

  • Functions
  • Loops
  • Lists
  • Dictionaries
  • Modules (JSON, CSV)
  • Files – Programmatic File Reading/Writing
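To give a flavor of how those topics fit together, here is a small sketch of reading CSV text into a list of dictionaries and converting it to JSON and back. It is not the Codecademy project code; the function names and the sample data are invented for illustration, and it uses in-memory text instead of real files so it can run anywhere.

```python
import csv
import io
import json


def csv_text_to_json(csv_text: str) -> str:
    """Read CSV text into a list of dictionaries and dump it as JSON."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


def json_to_csv_text(json_text: str) -> str:
    """Load a JSON list of dictionaries and write it back out as CSV text."""
    rows = json.loads(json_text)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    for row in rows:  # loop over the list, one dictionary per CSV line
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample data, not from the actual project
sample_csv = "username,attempts\nalice,3\nbob,1\n"
as_json = csv_text_to_json(sample_csv)
print(as_json)
print(json_to_csv_text(as_json))
```

Note that the csv module reads every value as a string, so the attempts column comes back as "3" and "1" unless you convert it yourself.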

Looking back on what I’m comfortable with now and how much I’ve learned in one year amazes me. I don’t look back much nor often. But I recall a sinking, confused feeling about not understanding loops, when to use a function, and the purpose of lists and dictionaries. Now I can’t imagine doing any Python analysis or mini project without loops and lists at a minimum. I’m comfortable using them, something distinctly different from before.

This shows me the power of bite-sized but consistent practice. Most lesson topics are divided into about a dozen pages, and I do the reading and practice for 1-2 lesson pages each sitting. That’s 10 minutes or less of light and easy studying. I don’t let long stretches of days pass between each sitting. Recently I’ve shifted my Python study time to earlier in the day to ensure I get it done. I feel the power of compounding knowledge and love it. Is this what the power of compounding interest is also like? The journey along the way has actually been fun.

Onward to the next and final lesson of Python 3, Classes!

Test the Truth

The previous post on falsiness (which should be “falseness”, but I will continue with the ‘i’ since “Truthiness” is the conceptual term instead of “Truthfulness”) has me thinking, and steam’s coming out of the engine. I wanted to see for myself these different flavors of False in action, as well as variants of Truthiness.

See the results for yourself running this code in a Python IDE. Experimenting with this made me discover {} is part of the falsiness group, too.

# Values for test: False, 0, None, [], {}

test = []

if test:
    print("True. Condition passed. If statement succeeded.")
else: print("False. Condition did not pass. If statement failed.")
>>> False. Condition did not pass. If statement failed.

test = [1]

if test:
    print("True. Condition passed. If statement succeeded.")
else: print("False. Condition did not pass. If statement failed.")
>>> True. Condition passed. If statement succeeded.
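Rather than editing test by hand for each value, the same experiment can be run in a loop with the built-in bool() function, which reports how an if statement would treat a value. This is a small sketch; the list of candidate values and the helper name truthiness_report are my own additions.

```python
def truthiness_report(values):
    """Pair each value's repr with the result of bool(value)."""
    return [(repr(value), bool(value)) for value in values]


# Falsy candidates from the post, plus "" and a few truthy counterparts
candidates = [False, 0, None, [], {}, "", True, 1, [1], {"a": 1}, "hi"]

for label, result in truthiness_report(candidates):
    print(f"{label:>10} -> {result}")
```

The empty string "" lands in the falsy group as well, alongside the empty list and empty dictionary.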

“Falseness”: False, None, 0 and [ ]

Here’s a lesson on “falseness”, that is, whether values are classified as True or False in Python.

I’m working on a Codecademy project (Abruptly Goblins) where there’s a gamer named Kimberly who is available to play on Monday, Tuesday and Friday. There will be other gamers added in later.

Let’s make a dictionary with name and availability as keys.
We’ll also make an empty list called gamers to store valid gamer details.

gamers = []
kimberly = {"name":"Kimberly Chook", "availability": ["Monday", "Tuesday", "Friday"]}

The project instructions say to:
Create a function called add_gamer that takes two parameters: gamer and gamers_list. The function should check that the argument passed to the gamer parameter has both "name" and a "availability" as keys and if so add gamer to gamers_list.

The number of times ‘gamer’ and variants are being tossed around make these instructions confusing as heck! But I plow through. Here’s what I came up with:

def add_gamer(gamer, gamers_list): #gamers_list is the parameter. gamers = [] is what the parameter value will be (argument).
    if gamer.get("name") and gamer.get("availability"): # Access name and avail values if they exist. If any keys not found, returns None.
        gamers_list.append(gamer)
    else: print("Failure, Gamer doesn't have name or availability")

Notice the if statement here. It seems incomplete to me:
if gamer.get("name") and gamer.get("availability"):

We will be inserting gamer dictionary arguments. If it doesn’t contain “name” or “availability” as a key, the .get() method will return None (because it did not find the key, and thus has no corresponding value).

But there is something weird assumed here. If the gamer argument does contain keys of “name” and “availability”, the if statement is True, so proceed with the function (appending the player’s details to the gamer list).

1. Why do the two .get() statements result in a True / pass go, collect $200?

2. If any of the .get() statements results in a None, why is that a False / do not pass go, do not collect $200?

The answer to #1 is still unknown to me, but I did find out #2 from Stack Overflow:

The expression x or y evaluates to x if x is true, or y if x is false.

Note that “true” and “false” in the above sentence are talking about “truthiness”, not the fixed values True and False. Something that is “true” makes an if statement succeed; something that’s “false” makes it fail. “false” values include False, None, 0 and [] (an empty list).

from Stack Overflow https://stackoverflow.com/questions/3914667/false-or-none-vs-none-or-false

When any .get() call returns None, that falls in the “False” category in Python, so the if block will not proceed. I tested this out by running:

gamers = []

def add_gamer(gamer, gamers_list):
    if gamer.get("name") and gamer.get("availability"):
        gamers_list.append(gamer)
    else: print("Failure, Gamer doesn't have name or availability")

kimberly = {"notname":"Kimberly Chook", "availability": ["Monday", "Tuesday", "Friday"]}
print(kimberly.get("name") and kimberly.get("availability"))

>>> None
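The flip side can be tested the same way. In Python, and returns one of its operands rather than a strict True or False, and a non-empty string or non-empty list counts as “true”. The sketch below shows that, along with a variant that checks for the keys directly with in. The name add_gamer_strict and the gamer called Pat are made up for illustration; this is not the project’s official solution.

```python
def add_gamer_strict(gamer, gamers_list):
    """Append gamer only if both keys are present (hypothetical variant)."""
    if "name" in gamer and "availability" in gamer:
        gamers_list.append(gamer)
        return True
    print("Failure, Gamer doesn't have name or availability")
    return False


kimberly = {"name": "Kimberly Chook", "availability": ["Monday", "Tuesday", "Friday"]}

# `and` hands back its second operand when the first one is truthy,
# so this prints the availability list, which is non-empty and therefore truthy.
print(kimberly.get("name") and kimberly.get("availability"))

# A gamer with both keys but no free days yet (made-up example)
pat = {"name": "Pat", "availability": []}

# The .get() version treats the empty list as false; the `in` version only checks keys.
print(bool(pat.get("name") and pat.get("availability")))

gamers = []
add_gamer_strict(kimberly, gamers)
add_gamer_strict(pat, gamers)
print(len(gamers))
```

Which behavior is right depends on whether a gamer with an empty availability list should be allowed in, so treat the in version as an alternative to weigh, not a correction.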

So when you ask your partner “Did you clean the bathroom yet?” and get no response for an answer (none, nada, nothing), you can interpret that as: status_bathroom_is_clean = False.

Ramsey’s First Eggs – Python Loop Regressions

I’ve been gathering data about my hens’ eggs, like how many eggs are laid per day and by whom. One of my baby hens ‘Ramsey’ started laying eggs on March 21st. I weighed the eggs each day and recorded the data. The weight appears to increase gradually over time.

Day    Egg Weight (grams)
0      39
1      42
2      42
3      43
4      47
5      44
6      44
7      43
8      44
9      46
10     50
11     55

I experimented with creating a linear regression (y = mx + b) to find the line of best fit using Python. I plotted the data and could tell this was not linear, so then I constructed a quadratic regression (y = ax^2 + bx + c).

# Set up Quadratic Regression

def calculate_error(a, b, c, point):
  (x_point, y_point) = point
  y = a * x_point**2 + b*x_point + c # Quadratic
  distance = abs(y - y_point)
  return distance

def calculate_all_error(a, b, c, points):
  total_error = 0 # Set initial value before starting loop calculation

  for point in points:
    total_error += calculate_error(a, b, c, point)
  return total_error

I entered the egg weight data as a list (datapoints), and iterated over a range of a, b, and c values to find what combination of a, b, and c would give the smallest error possible (smallest absolute distance between the regression line and actual values). I set initial values of a, b, and c = 0 and smallest_error = infinity and updated (replaced) them each time the error value was smaller than before.

# Ramsey Egg Data
datapoints = [
  (0,39),
  (1,42),
  (2,42),
  (3,43),
  (4,47),
  (5,44),
  (6,44),
  (7,43),
  (8,44),
  (9,46),
  (10,50),
  (11,55)
]

a_list = list(range(80,100))
possible_as = [num * .001 for num in a_list] # candidate a values: 0.080 to 0.099
b_list = list(range(-10,10))
possible_bs = [num * .001 for num in b_list] # candidate b values: -0.010 to 0.009
c_list = list(range(400,440))
possible_cs = [num * .1 for num in c_list] # candidate c values: 40.0 to 43.9

smallest_error = float("inf")
best_a = 0
best_b = 0
best_c = 0

for a in possible_as:
  for b in possible_bs:
    for c in possible_cs:
      loop_error_calc = calculate_all_error(a, b, c, datapoints)
      if loop_error_calc < smallest_error:
        best_a = a
        best_b = b
        best_c = c
        smallest_error = loop_error_calc

print(smallest_error, best_a, best_b, best_c)
print("y = ",best_a,"x^2 + ",best_b,"x + ", best_c)

Ultimately I got the following results:

y = 0.084 x^2 + -0.01 x + 41.7
Which gives a total error of 19.828.

This error feels big to me. I would like to get it as close to 0 as possible, or within single digits. One thing I may do is remove the data point of day 4, 47 grams, which was unusually large.

I plotted the data in an Excel graph and added a quadratic regression line as well. The resulting regression line is y = 0.0972x^2 – 0.1281x + 41.525. This is close to my Python quadratic regression, but not the same. I’d like to figure out why these differ when the model is similar. I believe this may have to do with the formula for the error calculation: I am using Total Absolute Error, whereas the more common standard is Mean Squared Error. My search grid may also play a part: the best b of -0.01 sits at the lower edge of the b values I tried, so the loop could never have reached Excel’s -0.1281.
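One way to probe the difference is to score both curves under both error formulas. The sketch below does that with the egg data and the two sets of coefficients quoted in this post. The function names are my own, and it assumes the Excel trendline is a least-squares fit; if that assumption holds, the Excel curve should win on squared error while the grid search result wins on absolute error.

```python
# Ramsey's egg data from the post: (day, weight in grams)
datapoints = [(0, 39), (1, 42), (2, 42), (3, 43), (4, 47), (5, 44),
              (6, 44), (7, 43), (8, 44), (9, 46), (10, 50), (11, 55)]


def predict(coeffs, x):
    """Evaluate y = a*x^2 + b*x + c for coeffs = (a, b, c)."""
    a, b, c = coeffs
    return a * x**2 + b * x + c


def total_absolute_error(coeffs, points):
    """Sum of |predicted - actual| over all points (the metric used in the post)."""
    return sum(abs(predict(coeffs, x) - y) for x, y in points)


def mean_squared_error(coeffs, points):
    """Average of (predicted - actual)^2 over all points."""
    return sum((predict(coeffs, x) - y) ** 2 for x, y in points) / len(points)


python_fit = (0.084, -0.01, 41.7)       # grid search result from the post
excel_fit = (0.0972, -0.1281, 41.525)   # Excel trendline from the post

for label, coeffs in [("Python grid search", python_fit), ("Excel trendline", excel_fit)]:
    abs_err = total_absolute_error(coeffs, datapoints)
    mse = mean_squared_error(coeffs, datapoints)
    print(f"{label}: total absolute error = {abs_err:.3f}, mean squared error = {mse:.3f}")
```

Squaring the distances punishes the large misses much more than the small ones, so the two metrics can prefer slightly different curves for the same data.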

Note how the data points do not follow linear growth, hence quadratic time!

Stop immediately, not after

Something that boggles me is why range(a, b) in Python includes the a value, but not b. In math, the interval (a, b) implies neither a nor b is included, since parentheses are exclusive. Square brackets, as in [a, b], are inclusive. So why is Python’s range(a, b) part inclusive, part exclusive? It doesn’t follow the math rules I’d expect.

I did some research and came across this snippet:

Python range is inclusive because it starts with the first argument of the range() method, but it does not end with the second argument of the range() method; it ends with the end – 1 index. The reason is zero-based indexing.

https://appdividend.com/2021/03/24/python-range-inclusive/

Now I have a lead, but still want to understand: How does zero-based indexing affect range inclusion? Here’s an explanation that made things *click* for me.

I think it may help to add some simple ‘real life’ reasoning as to why it works this way, which I have found useful when introducing the subject to young newcomers:

With something like ‘range(1,10)’ confusion can arise from thinking that pair of parameters represents the “start and end”.

It is actually start and “stop”.

Now, if it were the “end” value then, yes, you might expect that number would be included as the final entry in the sequence. But it is not the “end”.

Others mistakenly call that parameter “count” because if you only ever use ‘range(n)’ then it does, of course, iterate ‘n’ times. This logic breaks down when you add the start parameter.

So the key point is to remember its name: “stop“. That means it is the point at which, when reached, iteration will stop immediately. Not after that point.

So, while “start” does indeed represent the first value to be included, on reaching the “stop” value it ‘breaks’ rather than continuing to process ‘that one as well’ before stopping.

One analogy that I have used in explaining this to kids is that, ironically, it is better behaved than kids! It doesn’t stop after it supposed to – it stops immediately without finishing what it was doing. (They get this 😉 )

Another analogy – when you drive a car you don’t pass a stop/yield/’give way’ sign and end up with it sitting somewhere next to, or behind, your car. Technically you still haven’t reached it when you do stop. It is not included in the ‘things you passed on your journey’.

User dingles – https://stackoverflow.com/questions/4504662/why-does-rangestart-end-not-include-end

It makes more sense now. Python’s range(a, b) starts iterating at a and stops at b – right when it hits b, so that it does not include it.
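Here is a quick check of the start and stop behavior, plus two side effects of stopping before b that I find handy. The variable names are just for illustration.

```python
start, stop = 1, 10

values = list(range(start, stop))
print(values)        # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(len(values))   # 9, which is stop - start

# With only one argument, start defaults to 0, so range(n) runs n times
print(list(range(3)))  # [0, 1, 2]

# Back-to-back ranges share a boundary without overlapping or leaving a gap
first_half = list(range(0, 5))
second_half = list(range(5, 10))
print(first_half + second_half == list(range(0, 10)))  # True
```

Because the stop value is left out, the number of items is simply stop minus start, and the stop of one range can be reused as the start of the next.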

Onward with learning, kids!

To cluck or not to cluck

I’ve been coding! Like the slow erosion of a river forming a canyon, I am steadily pecking away at Python to become a better programmer. Here is a lil project I did today. Why chickens? I’ll explain in a future post. Stay tuned! Bok bok bok!

# Magic 8 Ball - Ask a question, reveal an answer.

import random

name = "Heeju"

question = "Should I get hens this weekend?"

answer = ""
answer_2 = ""

# First question random answer generation
random_number = random.randint(1,10)

if random_number == 1:
  answer = "Yes - definitely."
elif random_number == 2:
  answer = "It is decidedly so."
elif random_number == 3:
  answer = "Without a doubt."
elif random_number == 4:
  answer = "Reply hazy, try again."
elif random_number == 5:
  answer = "Ask again later."
elif random_number == 6:
  answer = "Better not to tell you now."
elif random_number == 7:
  answer = "My sources say no."
elif random_number == 8:
  answer = "Outlook not so good."
elif random_number == 9:
  answer = "Very doubtful."
elif random_number == 10:
  answer = "Don't rush it. Give it some time."
else:
  answer = "Error (number outside of range)"

# Second question random answer generation

random_number_2 = random.randint(1,9)
if random_number_2 == 1:
  answer_2 = "Yes - definitely."
elif random_number_2 == 2:
  answer_2 = "It is decidedly so."
elif random_number_2 == 3:
  answer_2 = "Without a doubt."
elif random_number_2 == 4:
  answer_2 = "Reply hazy, try again."
elif random_number_2 == 5:
  answer_2 = "Ask again later."
elif random_number_2 == 6:
  answer_2 = "Better not to tell you now."
elif random_number_2 == 7:
  answer_2 = "My sources say no."
elif random_number_2 == 8:
  answer_2 = "Outlook not so good."
elif random_number_2 == 9:
  answer_2 = "Very doubtful."
else:
  answer_2 = "Error (number outside of range)"


if question == "":
  print("You didn't ask a question. Please ask one!")
elif name == "":
  print(question)
elif name != "":
  print(name,"asks:", question)
else:
  print(name,"asks:", question)


print("Magic 8-ball's answer:", answer)

print("Is this truly random?", answer_2)

The great reveal:

Garden Zone

As part of my Python programming practice, I came up with a module and function that randomly generates 5 USDA Plant Hardiness Zones / Garden Zones.

from random import randint, choice

def randomgardenzone():
    test_list = ['a', 'b']
    for i in range(1, 5+1):
        x = randint(1, 13)
        res = choice(test_list)
        print(x, res)

randomgardenzone()

For context: the US Department of Agriculture has 13 designated “zones” for the country, based on the average annual minimum temperature. Each zone number is 10 degrees F apart. There is a further subdivision of zones with a letter ‘a’ or ‘b’, where ‘b’ is 5 degrees F warmer than ‘a’.

These zones are useful for gardeners because we can confidently plant specimens that are hardy (cold/frost/freezing temperature tolerant) to their zone. This is why mango and bananas don’t grow in Minnesota, while they may thrive in a Floridian garden. The Minnesotan would have to have a toasty heated greenhouse in order to cultivate mango trees or bananas through their winter. (Did you know the banana plant is actually an herb, not a tree?)

I’m curious how some plants are able to be cold-hardy and resist freezing. When it gets below 32°F, the water in the plant cells wants to freeze and expand. This would rupture the cell walls and make the plant lose its structure, becoming frost damaged, mushy, and sadly, not salvageable. I heard that cold-hardy plants contain a natural antifreeze that prevents this. I’m curious how antifreeze works, and if it’s similar to what’s used in automobiles. Dianthus is an example of a common plant that is cold-hardy (you can grow them in Alaska), and in fact they need a cold season in order to thrive.

Python Things I learned:

  • Use “for i in range(1, 5)”, not just “for i in (1,5)”. A simple doh!-type mistake!
  • range(a, b) works like [a, b) – it is exclusive of the b value. However, randint(a, b) is inclusive of the b value.
  • The “choice( )” function from the random module lets you pick a random item from a list. This was useful to pick the zone letters ‘a’ or ‘b’, since randint( ) is only used to pick a random integer.
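Putting those three lessons together, here is a sketch of a variant that returns the zones as a list instead of printing them. The name random_garden_zones and the optional rng parameter are my own additions (passing a seeded random.Random makes the output repeatable for testing); the ranges match the function above.

```python
import random


def random_garden_zones(count=5, rng=None):
    """Return `count` random zone labels such as '7a' (hypothetical variant)."""
    if rng is None:
        rng = random.Random()
    zones = []
    for _ in range(count):               # range(count) runs exactly `count` times
        number = rng.randint(1, 13)      # randint includes both 1 and 13
        letter = rng.choice(["a", "b"])  # choice picks one item from the list
        zones.append(f"{number}{letter}")
    return zones


print(random_garden_zones())

# range(1, 5) stops before 5, so it yields only 4 numbers
print(list(range(1, 5)))  # [1, 2, 3, 4]
```

Returning a list instead of printing makes it easier to reuse the zones later, for example to count how often each one comes up.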

“I don’t know where to start”

This guy Zook has a great noggin and head.

I’m reading “Own Your Weird” by Jason Zook, and want to share this golden nugget with you. Maybe it’ll lift you up if you’re embarking on a hard quest like me (like learning computer programming, for real this time, or reducing material consumption/living minimally and realizing how much is “enough”):

“Our brains have this mystical, magical, commanding power over us. It can be incredibly difficult to challenge our own thoughts. Even if we have data from other sources, we often still can’t get past our own mental barriers.

Assumptions about starting your next business/project/whatever:

I’m amazed at how often I hear from people who are talking themselves out of being successful…[with] phrases like:

“I don’t know where to start.”

By picking up this book, you are starting. By wanting to start, you are starting. So check that one off the list right now. But you know those things aren’t enough. Eventually, you just have to put one foot in front of the other (or click a mouse one click in front of the other?). Start small, start scared, but just start.