Sunday, January 25, 2009

Automated Mirroring and the BitTorrent Protocol

As an avid Ubuntu supporter, I do what I can in my limited amount of free time to help. I submit hardware specs with my live USB drive, submit and comment on bug reports in Launchpad, create and vote on ideas in Ubuntu Brainstorm, and tell everyone I know about this amazing operating system that I use. My girlfriend even gave me a shirt from the Ubuntu Shop (it was my idea).

There is one more big thing that I do to help Ubuntu; I seed its disk images. Currently, I am seeding the following images:
  1. Jaunty Alternate 64 bit
  2. Jaunty Alternate 32 bit
  3. Jaunty Desktop 64 bit
  4. Jaunty Desktop 32 bit
  5. Ubuntu 8.04.1 DVD 64 bit
  6. Ubuntu 8.04.1 DVD 32 bit
  7. Ubuntu 8.04.2 Alternate 64 bit
  8. Ubuntu 8.04.2 Alternate 32 bit
  9. Ubuntu 8.04.2 Desktop 64 bit
  10. Ubuntu 8.04.2 Desktop 32 bit
  11. Ubuntu 8.04.2 Server 64 bit
  12. Ubuntu 8.04.2 Server 32 bit
  13. Ubuntu 8.10 Alternate 64 bit
  14. Ubuntu 8.10 Alternate 32 bit
  15. Ubuntu 8.10 Desktop 64 bit
  16. Ubuntu 8.10 Desktop 32 bit
  17. Ubuntu 8.10 DVD 64 bit
  18. Ubuntu 8.10 DVD 32 bit
  19. Ubuntu 8.10 Server 64 bit
  20. Ubuntu 8.10 Server 32 bit
That is 26.8 GB of data. I have uploaded 638.0 GB since installing Ubuntu 8.10 and uploaded over 1 TB while using Ubuntu 8.04. Furthermore, I expect my average upload rate to increase since I only started seeding the DVD and 8.04 images in the last few days. (UPDATE: Now I average about 500 KB/s upload and have uploaded 910.6 GB since installing Ubuntu 8.10.)

I am able to achieve these numbers because of my superb Internet connection in my dorm room at Iowa State University. I measured my connection speed today. Using a server in the Twin Cities, Minnesota (44 ms ping at ~150 miles away), I measured 19,987 Kb/s down and 10,436 Kb/s up (or about 19.5 Mb/s down and 10.2 Mb/s up). I have witnessed uploads to individual leechers at 5+ MB/s.

Even though I have helped distribute so many files, my Internet connection could upload more and my hard drive could store more. I also have to find and start the torrents myself. The Ubuntu site makes it easy to find direct HTTP downloads of the newest images, but it is nontrivial to find all 20 of these torrent files. Moreover, when new alphas or maintenance packs are released, I have to get the new torrent files manually. It would be better if my ability to help Canonical upload were automated.

I wish there were a client-server program that would automate this process. Canonical would run the server version and people like me would run the client version. This program would essentially be a type of distributed computing program. Normally, distributed programs exist in order to access more computational power. This program would be distributed in order to access more upload bandwidth. On the client side, I would notify the server of my existence and specify the amount of disk space I am willing to allocate. Then the server would decide which files I should be uploading, much like the BitTorrent protocol, where rare pieces are shared first.
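To make the idea a little more concrete, here is a minimal sketch of how such a server might decide which files each volunteer should hold. Everything in it is an assumption for illustration: the Client class, the assign_files function, and the greedy "least-replicated file first" rule are hypothetical, and the rule works on whole files rather than on pieces as BitTorrent does. It is not anything Canonical actually runs.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    """A volunteer mirror (hypothetical model for this sketch)."""
    name: str
    free_bytes: int                       # disk space the volunteer offered
    files: set = field(default_factory=set)


def assign_files(file_sizes, clients):
    """Greedy rarest-first placement.

    Repeatedly pick the file with the fewest replicas (larger files win
    ties, so they are placed while space is still available) and give it
    to the client with the most free space that can still hold it.
    Returns a dict mapping each file name to its replica count.
    """
    replicas = {name: 0 for name in file_sizes}
    progress = True
    while progress:
        progress = False
        by_rarity = sorted(replicas,
                           key=lambda f: (replicas[f], -file_sizes[f], f))
        for fname in by_rarity:
            candidates = [c for c in clients
                          if fname not in c.files
                          and c.free_bytes >= file_sizes[fname]]
            if not candidates:
                continue  # nobody can take this file; try the next rarest
            target = max(candidates, key=lambda c: c.free_bytes)
            target.files.add(fname)
            target.free_bytes -= file_sizes[fname]
            replicas[fname] += 1
            progress = True
            break  # rarity changed, so re-sort before the next placement
    return replicas


if __name__ == "__main__":
    # Sizes in MB; the file names and numbers are made up.
    sizes = {"desktop.iso": 700, "server.iso": 700, "dvd.iso": 4000}
    mirrors = [Client("dorm", 5000), Client("laptop", 1500)]
    print(assign_files(sizes, mirrors))
    for m in mirrors:
        print(m.name, sorted(m.files), m.free_bytes)
```

A real system would also have to weigh each client's upload bandwidth and uptime, and would need to rebalance when clients disappear. The sketch only shows that the server needs nothing more from a client than its existence and its disk allocation to start making decisions.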

All the clients could certainly share their files over BitTorrent, but this program could also allow users to directly download from a client, similar to how mirrors work now. Moreover, this program could work with more than just disk images. In Ubuntu Brainstorm Idea 7792, someone requested that apt-get use the BitTorrent protocol. Going further than that, this program could also help distribute all the software updates and repositories: Main, Universe, Restricted, and Multiverse.

In summary, I want to be able to automatically help Canonical upload any and all files either using direct HTTP or BitTorrent.

Friday, January 16, 2009

James McCanney and his book Calculate Primes

While surfing the Internet last week, I found this torrent: 2,650 year-old math problem solved by James McCanney (a.k.a. Jim McCanney). In this two-hour audio file, James McCanney is interviewed by Brad Walton (of the WCCO radio station in Minneapolis, Minnesota, USA) about his book Calculate Primes. It was recorded on March 17, 2007 (they wished each other a happy Saint Patrick's Day), and Walton also blogged about the interview on the same day.

Several of the following references that I link to are surely not reliable sources, but there are not too many references to be found, so be sure to make your own conclusion instead of just blindly believing mine.

Before I talk about Calculate Primes, let me share what I found about McCanney himself. In the interview, Walton continually refers to McCanney as a professor, but this title is misleading. McCanney's website provides a long bio highlighting his past. The first part is about his education history, including the fact that he used to be an "introductory instructor" at Cornell University (in Ithaca, New York, USA) for the physics department and then the math department. However, he was fired from both positions for his radical theories in physics (at least according to McCanney's own bio, which I believe). In this forum post, McCanney claims that he has also taught at other schools and "earned" the title of professor. My guess is that he is referring to the schools in South America that were mentioned in his bio, which does not count in my book. So as far as mainstream academia is concerned, McCanney only reached the level of introductory instructor before being fired, twice.

McCanney has many theories in physics that are not accepted by the rest of the academic community. The website Bad Astronomy has a page dedicated to McCanney and his more "popular" theories.

Most of the hits on Google for "James McCanney" are related to his work in physics. It seems that everyone disagrees with almost all of his work. However, it is possible that his work with prime numbers is valid, so I will give McCanney's results a chance to convince me.

I was very excited to listen to the interview but also wondered why I had not heard about this before. I listened to the whole interview, but McCanney only discussed specific details about his findings a few times. From what I gathered, McCanney created some sort of function that is repeatedly applied to a set of numbers. He said that the initial set is {0, 1}. I think that the numbers in this set are added to and subtracted from what McCanney called "magic numbers." Eventually McCanney said that "magic numbers" were his simplified term for sequential prime products, which are also known as primorials. McCanney also said that this process will produce some "false primes" (some composite numbers). However, I thought he said that in the next iteration of this process, they would no longer be in the set.

In this forum post, an owner of the book said:
"McCanney has to be the worst speller I have ever encountered. McCanney apparently does not believe in proofreading. His books have many typos and incomplete or ungrammatical sentences. Publishing material in this state is almost an insult to the reader."

The obvious lack of proofreading led me to find out who published this book. Both McCanney's website and Amazon had this information. Calculate Primes is self-published by " press."

The three-hour DVD that comes with the book appears to be of the same quality. In this forum, the eleventh poster says:
"I've looked at the DVD a bit. It's not exactly Hollywood. It appears to be essentially a home video. ...I would mention that it seems to be a low budget production."

In this forum post, another owner of the book said:
"...Mr. McCanney changes the names of sets during the book, and sometimes uses different names for the same thing even in the same equation."

This same person goes on to say:
"The (infinite) union of repetition groups, each with an infinite number of members is effectively a sieve of Eratosthenes as far as I can see."

In agreement with this last person, the people on this forum provided some math from McCanney's function and concluded that McCanney's work was probably a reinvention of the Sieve of Eratosthenes. I also agree with this conclusion. Look at the numbers on the left hand side when using the "magic number" 2. They are all numbers that are not multiples of 2. When the "magic number" is 6 (= 2 x 3), the numbers on the left hand side are not multiples of 2 or 3. Finally, when the "magic number" is 30 (= 2 x 3 x 5), the numbers on the left hand side are not multiples of 2, 3, or 5.
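The pattern described above is easy to check with a few lines of code. The sketch below is not McCanney's function, which I have not seen in full. It only assumes that the numbers kept for a given "magic number" (primorial) are exactly the ones sharing no factor with it, which is what striking out multiples of 2, 3, 5, and so on in a sieve leaves behind. The function names primorials, survivors, and is_prime are my own.

```python
from math import gcd


def is_prime(n):
    """Trial division; fine for the small numbers used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def primorials(count):
    """Return the first `count` primorials: 2, 6, 30, 210, ..."""
    result, product, candidate = [], 1, 2
    while len(result) < count:
        if is_prime(candidate):
            product *= candidate
            result.append(product)
        candidate += 1
    return result


def survivors(magic):
    """Numbers in 1..magic that share no factor with `magic`.

    For a primorial, these are the numbers a sieve leaves standing
    after removing multiples of each prime in the product.
    """
    return [n for n in range(1, magic + 1) if gcd(n, magic) == 1]


if __name__ == "__main__":
    for magic in primorials(3):
        print(magic, survivors(magic))
    # 2 [1]
    # 6 [1, 5]
    # 30 [1, 7, 11, 13, 17, 19, 23, 29]

    # With 210 (= 2 x 3 x 5 x 7), some composites survive:
    print([n for n in survivors(210) if n > 1 and not is_prime(n)])
    # [121, 143, 169, 187, 209]
```

The composites that survive 210 are all products of primes larger than 7, and they would be removed once the next primes are folded into the product. That looks like the "false primes" behavior described in the interview, and it is exactly how the Sieve of Eratosthenes behaves.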

Given the limited information that I could find on the Internet, I have concluded that McCanney's book Calculate Primes does not contain a new, revolutionary way to calculate prime numbers. In order to learn more, I would have to buy the book, but I believe that this would only make me more convinced that McCanney did indeed reinvent the Sieve of Eratosthenes.

Friday, January 9, 2009

Google is Almost Perfect

Over the last four years, I have become a fanboy of many things. Google is one of them. Google is part of what finally made me decide to start this blog, because it is hosted by Google. However, the look and feel was not exactly the same as other Google products, so it was pretty obvious to me that Blogger was purchased by Google sometime in the past. I wish the hyperlink button used JavaScript (like Gmail does) instead of opening a new window so that I could start the hyperlink process, get the URL, then come back and paste it. I am picky; I don't deny it.

Google is the number one search engine and number two website (to Yahoo). They help set the standard that other websites feel that they have to match. Specifically, most (if not all) of the services of Google are free, so sites like Twitter continue to provide the services at their site for free even though they have no alternative income. As a company, Google can provide its services for free because it makes gobs of money through ads.

Google has done so many things right on the Internet. Their search engine makes so many things possible. They support numerous open source software projects through their Google Summer of Code program. (I would actually like to get on a project next summer.) Google even has the motto of "Don't be evil." However, Google has done one thing that I deem as evil.

In 2005, many people were talking about book scanning projects. Many different companies, organizations, and people are trying to digitize all books. Google Book Search is the result of Google's lone efforts. Google is working alone because they do not want to share their scanned books with others, which does not sound like the Google I know and love. Google dominates because of its exceptional searching abilities. Google should not care who is holding the books; their algorithms will be what everyone uses.

Most people had problems with Google's scanning actions because Google was scanning books that were still in print and under copyright. The other book scanning projects are sticking to books that are out of copyright and out of print. Since the controversy, which included many class action lawsuits, Google has negotiated deals with the copyright holders under which Google can continue its work.

First Post

Hello everyone.

I have thought about starting a blog for a while now. I thought Twitter would be my answer since I did not think I wanted to write posts that long, but I hit the character limit more often than not.

I was unsure if I would be able to create a name for my blog as good as my girlfriend's, but I think I did a good job. Recursively enumerable is a term from theoretical computer science, the subject in which I will be attending graduate school. In this context, it means that I will eventually write a blog post for every possible topic (given an unlimited amount of time).

I hope that you will find that my blog is worth reading.