Code Writeup: Writing a GCD Function

Over the last couple of months, I haven't been able to set aside as much time to practice coding as much as I would like. However, wanting to stay in practice I have tried to set aside some time on weekends in order to do one or two challenges on checkio.org. One of the nice things about the challenges posted there is that they don't simply ask you to solve problems, but you have to solve them in a way that manages to past tests that they have set for each problem. Oftentimes, I will come up with a solution to a problem, but will need to then go back to think about how I can either make my solution more efficient or handle various edge cases the authors have included. Today, I wanted to go through and focus on one particular problem to analyze various ways of solving it.

The problem in question was fairly standard, asking me to figure out the greatest common divisor for a set of integers provided as arguments to a function. It is worth noting that the integers were positive (0 < x <= 2**32), and the approaches presented here may need some additional logic if we wanted to extend the solutions to include negative integers (return the negative of the gcd of the absolute values of the arguments) or 0 (either return the non-zero number or I think either undefined or zero for gcd(0,0)).

Initial Approach

def greatest_common_divisor(*args):
    """Brute force gcd function."""
    for possible_divisor in reversed(range(1, min(args) + 1)):
        if all([arg % possible_divisor == 0 for arg in args]):
            return possible_divisor

A Slight Improvement

What seems like the most obvious way to improve this solution is to reduce the number of factors we need to check. Since we're only interested in the largest number that evenly divides all of our arguments: it doesn't really make sense to search from lower to higher possible solutions, since we can't know if we've found the gcd till the full range has been checked. But if we search for possible factors starting from higher numbers and working to lower numbers, then we can stop as soon as we hit a number that evenly divides all of our arguments.

def greatest_common_divisor(*args):
    """Brute force gcd function."""
    for possible_divisor in reversed(range(1,min(args))):
        if all([arg % possible_divisor == 0 for arg in args]):
            return possible_divisor

The interesting thing to think about with this solution, is that it doesn't improve worst case performance. If the gcd of all of our arguments is 1, then we still end up searching trying min(args) factors. However, it does improve best-case performance: In the best case requiring only 1 potential divisor to be checked.

Using prime factorization

My next approach was as follows: First, find the prime factorization for each number we are considering. For each of the prime factors, find how many times that factor occurs in prime factorization for each number. Finally, for the prime factors that occur in the prime factorization of all of the numbers we're trying to find the gcd for, take the product of those factors raised to however many times it occurs the least across all of our prime factorizations.

def prime_gen():
    found = [2]
    tocheck = 2
    yield 2
    while True:
        tocheck += 1
        divisor_found = False
        for prime in found:
            if tocheck % prime == 0:
                divisor_found = True
                break
        if not divisor_found:
            found.append(tocheck)
            yield tocheck


def factor(number):
    # print("factoring: {}".format(number))
    seed = number
    factors = []
    while seed > 1:
        testing_primes = True
        prime_generator = prime_gen()
        while testing_primes:
            prime_to_test = next(prime_generator)
            if seed % prime_to_test == 0:
                factors.append(prime_to_test)
                seed = seed // prime_to_test
                testing_primes = False
        # print("Seed: {}, Factors: {}".format(seed,factors))
    return factors


def greatest_common_divisor(*args):
    """
        Find the greatest common divisor
    """
    factor_dicts = []
    # calculate the prime factorization for each argument
    for arg in args:
        factorization = factor(arg)
        factor_powers = {}
        for unique_factor in set(factorization):
            factor_powers[unique_factor] = factorization.count(unique_factor)
        factor_dicts.append(factor_powers)
    # get factors that appear in all the prime factorizations
    common_factors = set(factor_dicts[0].keys())
    for factor_dict in factor_dicts:
        common_factors = common_factors.intersection(factor_dict)
    # if the set of common factors is empty 1 is the gcd
    if len(common_factors) == 0:
        return 1
    else:
        # calculate the product of the minimum power of each common factor
        minpowers = dict()
        for common_factor in common_factors:
            minpowers[common_factor] = min(
                [factor_dict[common_factor] for factor_dict in factor_dicts])
        minpower_exponentiated = [
            key**value for key, value in minpowers.items()]
        gcd = 1
        for powered_factor in minpower_exponentiated:
            gcd = gcd * powered_factor
        return gcd

While this approach works in theory, it isn't immediately apparent whether it will be better per se than the second solution. In theory, it seemed like it could be better if we memoized our function for finding primes, and needed to make more than one call to the greatest_common_divisor function.

This third solution still failed to pass the full test suite however. It handed all of the simple test cases that it was thrown at, but would hang on the first edge case. Uncommenting the print statements in the factor function produced the following results for the first two numbers in the test it tried to factor.

factoring: 2167650657
Seed: 722550219, Factors: [3]
Seed: 240850073, Factors: [3, 3]
Seed: 4544341, Factors: [3, 3, 53]
Seed: 2543, Factors: [3, 3, 53, 1787]
Seed: 1, Factors: [3, 3, 53, 1787, 2543]
factoring: 1496767446
Seed: 748383723, Factors: [2]
Seed: 249461241, Factors: [2, 3]
Seed: 83153747, Factors: [2, 3, 3]
Seed: 4376513, Factors: [2, 3, 3, 19]

It appeared that it was working fine for some numbers, but was needing to calculate fairly large primes for others. It managed to find the factors [2, 3, 3, 19] for 1496767446, but then would hang. As a small digression, if we want to figure out what prime it is struggling to calculate, we can do so with sympy fairly easily.

>>> from sympy import primefactors
>>> primefactors(1496767446)
[2, 3, 19, 4376513]

So there's the problem, it got stuck trying to find prime factors of 4376513 which was itself prime.

With Euclid's Algorithm

So with the previous approach choaking for numbers that include large prime factors, it was back to the drawing board. I had previously heard of Euclid's Algorithm, but hadn't taken the time to look into it, instead opting previously for a more number theoretic approach. (In retrospect, Donald Knuth's quote about the Euclidian algorithm on the information page for the challege probably should have pushed me in this direction in the first place, but here we are.)

def gcd(A,B):
    """ Recursive implementation of euclid's algorithm"""
    if A == 0:
        return B
    if B == 0:
        return A
    quotient = A//B
    remainder = A%B
    # A = B*quotient + remainder
    return gcd(B,remainder)

def greatest_common_divisor(*args):
    """
        Find the greatest common divisor
    """
    greatestCommonDivisor = min(args)
    for a in args:
        greatestCommonDivisor = gcd(greatestCommonDivisor,a)
    return greatestCommonDivisor

Note here that this solution works in part because the gcd is associative. Another equally valid approach for applying the function to an arbitrary number of arguments would have been to use reduce(gcd, args) (In python 3.0+, you would first have to include from functools import reduce in your imports).

Best approach

The best approach however, is probably to just rely upon the standard library when possible.

from fractions import gcd

def greatest_common_divisor(*args):
    """
        Find the greatest common divisor
    """
    greatestCommonDivisor = min(args)
    for a in args:
        greatestCommonDivisor = gcd(greatestCommonDivisor,a)
    return greatestCommonDivisor

If we take a look at \lib\fractions.py , we can see that the gcd function included in the standard library uses an iterative version of Euclid's Algorithm.

def gcd(a, b):
    """Calculate the Greatest Common Divisor of a and b.

    Unless b==0, the result will have the same sign as b (so that when
    b is divided by it, the result comes out positive).
    """
    while b:
        a, b = b, a%b
    return a

Summary

This challenge ended up being rather instructive for me for a couple of reasons. First, it helped me to realize that I need to take more time to analyze and research approaches to solving a problem before diving in and trying to solve it. Second, it really reinforces the point that if you really don't need to calculate primes, then you may want to question whether you should. Finally, working through different possible solutions got me to start thinking about the time-complexity of the code I've written.

Great Educational Youtube Series

For awhile now, I have been interested in collecting great educational videos and resources on this blog. Currently, I have various links to resources available under the "Lists-O-Links" tab in the navigation bar, but I'm still searching for the right format. But today I would like to take a few minutes to highlight a few really great educational video series I've come across on YouTube that have really impressed me.

Computerphile (YouTube Channel)

Features short videos (typically 4-8 minutes) on various computer related topics such as Chomsky's Hierarchy and computer vision.

Numberphile (YouTube Channel)

Sister channel to Computerphile. Features short videos (typically 4-8 minutes) on numbers.

Introduction To Higher Mathematics Video Series (YouTube Playlist)

A great video series I started working my way through last summer. Bill Shillito's "Introduction To Higher Mathematics" video series introduces the viewer to concepts in mathematics such as modular arithmetic, topology, and group theory.

Chaos - A mathematical adventure(Youtube Playlist)

9 part video series that features animations explaining concepts related to chaos theory and dynamic systems.

Carneades.org (Website, YouTube Channel)

Carneades.org is a great YouTube series on philosophical topics that is worth checking out for anyone interested in learning more about formal logic. They have been running a fantastic "100 Days of Logic" series covering categorical logic, predicate calculus, various modal logics, ect. If that sounds like something that you would like to learn more about: www.carneades.org is a good portal to a lot of their video series, and all the videos can be found at their YouTube channel.

Flame0430(YouTube Channel)

Flame0430's YouTube Channel hosts a really fantastic series of interviews with leading philosophers such as W.V. Quine, Hilary Putnam, Bernard Williams, A.J Ayer, and more.

Javascriptlets for Wayback Machine

Quick post today, the following are two javascriptlets for archiving and retrieving pages from the Internet Archive (originally found here). If you aren't familiar with javascriptlets, the general idea is that they function similar to bookmarks but rather than just taking you to a webpage they run some code written in javascript. To add them, open the prompt that you would use to add a webpage to your bookmarks, and copy and paste the code in where the address for the website would typically go.

Image of javascriptlet bookmark

Saving Pages to Wayback Machine

This first scriptlet "capture(s) a web page as it appears now for use as a trusted citation in the future."

javascript:location.href='http://web.archive.org/save/'+location.href

Get An Archived Version of a Page From The Wayback Machine

This second piece of code is useful for anytime a webpage is down. After adding it you can just click the bookmark you have it saved to, and if someone has previously saved the page to the internet archive, then you can get an archived copy of the webpage.

javascript:void(window.open('https://web.archive.org/web/*/'+location.href));

Software Pick: Instant Boss A Minimalist Time Manager

A good time management tool can be invaluable. But its too easy to get sucked into spending time trying to figure out how to regiment your time. The system that I've found to work best for me is a small program called "Instant Boss". You give it the number of minutes you would like to work for, the number of minutes you would like to break for, and the number of times you would like to repeat the process. And it will display a countdown timer, and play a sound effect when time is up. Work for 10 minutes. Break for two minutes. Repeat...

Instant Boss Screenshot

Links

Impossible Colors

I very vividly remember the first time I saw a new color. I was a freshman at Indiana University at the time, browsing through the stacks of Wells Library. On the 6th floor of the library (currently being remodeled), I hit upon an old issue of the journal Philosophical Psychology, and in it I found an article by Paul Churchland entitled Chimerical Colors: Some Phenomenological Predictions from Cognitive Neuroscience. I'm ashamed to say that I have never found the time to dig into the article itself: too striking are the demonstrations its contains. See Frank Heile's response to this Quora question for a summary of the paper and a few of the figures.

To keep this post brief, I'll leave a few links for the reader to explore in addition to the links already mentioned.

  1. The Eclipse of Mars Illusion
  2. Chimerical Colors: Some Phenomenological Predictions from Cognitive Neuroscience
  3. Impossible Colors (Wikipedia)
  4. Hyperbolic Orange And The River To Hell

Markdown Version of Thumbnail With Nikola

The following takes an image I have placed in a folder, Images, in the base directory for nikola., and produces the following output.

../images/lighthouseterror.thumbnail.jpg
(Note how clicking the image will open a lightbox.)
<center>
<div class="col-md">
<a class="col-md reference external image-reference" href="../images/lighthouseterror.jpg"><img alt="../images/lighthouseterror.thumbnail.jpg" src="../images/lighthouseterror.thumbnail.jpg"></a>
</div>
</center>

Here is what my folder structure looks like.

root
| 
+ --+ Images
|   |_  lighthouseterror.jpg  # Image You Want To Add
+ --+ Posts
    \__2015
       \__ This post.md # Location of This Post

Sublime-Snippet

Here's a snippet version of the above that I use with Sublime text editor. Unfortunately, there doesn't seem to be a good way of adding tab completion to the snippet.

<snippet>
    <content><![CDATA[
<div class="col-md">
<a class="col-md reference external image-reference" href="${1:../images/someimage}.${2:jpg}"><img alt="${1:../images/someimage}.thumbnail.${2:jpg}" src="${1:../images/someimage}.thumbnail.${2:jpg}"></a>
</div>
]]></content>
    <!-- Optional: Set a tabTrigger to define how to trigger the snippet -->
    <!-- <tabTrigger>hello</tabTrigger> -->
    <!-- Optional: Set a scope to limit where the snippet will trigger -->
    <!-- <scope>source.python</scope> -->
</snippet>

Test Markdown Post

Including an image file

suppawow

To include an image file. Use the format:

![ ALTTEXT ](/images/FILENAME)

To center an image. Use the html center tags.

<center> ![suppawow](/images/suppawow.png) </center>

To scale the image first, I will probably just be easier to use html, since I haven't figured out how to use alternative implimentations of markdown with nikola yet (Though mau may be an option.).

<img src="/images/suppawow.png" alt="ALT TEXT" style="width: 200px;"/>

ALT TEXT