codenode

DB<3> x $code

Why QuiBids is a scam

Posted on July 29th, 2010 by Daniel Nichter

QuiBids is a penny auction website with an irresistible hook: win awesome stuff at mind-blowing prices, like a brand-new Apple iPad 16 GB Wi-Fi for $4! It seems unbelievable but it’s true. It’s also very deceptive.

I spent a little over $300 today and won only some Pyrex containers (I did need them, at least). Before I expose the penny auction scam let me make clear: I’m not blogging this in a fit of rage over my $300 Pyrex containers. Believe it or not, I don’t care about the money. Also, I intended to spend that much because I was sure no one would outbid me at that price level (I was bidding for an iPad and historically they’re won at far less than $300). My motivation for this blog is simply to lay bare the plain details and facts about how (this) penny auction website works financially, to show and prove to you that, unless you enjoy seeing your money disappear at a fantastic rate whilst participating in the grotesque enrichment of the company to which it went, you should not participate in penny auctions.

I speak of QuiBids because it’s all I know. I can only guess that other penny auction websites operate similarly. If QuiBids objects to this blog post then that’s too bad because I’m protected by free speech and nothing I’m about to say is false to the best of my knowledge. (I will correct any factual inaccuracies pointed out to me.)

Let’s use this auction as an example. At the time of writing, the bid price for that iPad is $157.26. Let’s simplify and call it $150. Now here are some important facts:

  • Users (i.e. you and me) must buy bids; users do not bid with real money.
  • Each bid costs $0.60 (sixty cents). E.g. 75 bids = $45.
  • The price of an item, like our example iPad at roughly $150, is bid up in $0.01 (one cent) increments (this varies; sometimes the increment is more).
  • The user who wins can buy the item at its final bid price.

Let me first emphasize the last point. If an item is bid up to $150 and you win it, then you buy it for $150 plus all the money you spent on your own bids, that is, $0.60 times the number of bids you made. So if you made 100 bids then you spent $60 on bids (100 * $0.60). Therefore the real final price for your item is $210: $150 for the bid price + $60 spent on bids.

Obviously you want to place as few bids as possible, but that seems nearly impossible if you want any chance of winning a “sexy” item like an iPad because, as our example auction shows, the bidding may drag on for hours. Our example auction began sometime this morning around 11am; I know because I placed 278 bids ($166.80). This presents a very high barrier: bid competition. Some auctions end after fewer than 100 bids, and other auctions require thousands of bids. It’s easy to calculate how many bids an auction has received: for auctions with $0.01 (one cent) increments the number of bids is simply PRICE * 100. So our example is 150 * 100 = 15,000 bids. The math is simple: 1 bid = 1 cent increment in price and there are 100 cents in a dollar so there are 100 bids in each dollar of the price.

Now wrap your mind around this: each bid cost someone $0.60, so if an item, like our example iPad, has received 15,000 bids then that’s 15,000 * $0.60 = $9,000 that QuiBids receives for one iPad.

Someone is going to win that auction, and let’s say it’s someone who comes into the auction late and bids only 100 times, thus costing themselves $60 plus whatever the final price of the item is. Right now, while I’ve been typing, the price has increased from $157.26 to $160.78. Let’s be generous and hypothesize that someone won it at $160. So their final price is $160 + $60 = $220. And $160 at 1 cent increments requires 16,000 bids; subtract the 100 the winner placed and the losers spent $9,540 (15,900 bids) for nothing. I am one such loser.
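
The arithmetic above is easy to check in a few lines of code. Here is a minimal Python sketch, assuming a fixed $0.60 bid cost and one-cent increments as described; the function and variable names are made up for illustration and have nothing to do with how QuiBids itself computes anything.

```python
BID_COST_CENTS = 60    # each bid costs $0.60
INCREMENT_CENTS = 1    # each bid raises the price by $0.01 (assumed fixed)


def auction_summary(final_price_cents, winner_bids):
    """Return the dollar figures for one finished penny auction."""
    # One bid per increment, so the final price tells us the bid count.
    total_bids = final_price_cents // INCREMENT_CENTS
    loser_bids = total_bids - winner_bids
    return {
        "total_bids": total_bids,
        # What the site collects from bids alone, before the winner pays.
        "site_bid_revenue": total_bids * BID_COST_CENTS / 100,
        # Final bid price plus the winner's own bids.
        "winner_total_cost": (final_price_cents + winner_bids * BID_COST_CENTS) / 100,
        # Everything the non-winners paid for bids.
        "losers_spent": loser_bids * BID_COST_CENTS / 100,
    }


# The hypothetical outcome from the text: won at $160 after 100 bids.
summary = auction_summary(final_price_cents=16_000, winner_bids=100)
print(summary)
```

Working in whole cents keeps the sums exact; the division by 100 only happens at the end to get dollars.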

QuiBids is great if you’re that one-in-a-gazillion winner of an iPad or MacBook Pro after just 10 bids or $6, but my research and the QuiBids Help/FAQ are clear in stating that that is a very rare occurrence.

The bottom line: penny auction websites like QuiBids are a scam because countless people will waste countless hours of their lives only to lose thousands of dollars on a single item which costs, at retail, only a few hundred dollars, and all their lost money becomes the sole profit of the company which cleverly masks the dollars-and-cents reality of its game in a shroud of penny-bid lingo.

Parsing SQL WHERE clause

Posted on July 26th, 2010 by Daniel Nichter

Parsing a SQL WHERE clause is really difficult. A number of people have told me, “looks like grammar”, implying that I should use some traditional grammar/rule-based solution like yacc or bison, but I can’t because Maatkit tools (of which this is a part) must have minimal external dependencies. Plus, I don’t want or need to parse SQL fully; I bet only MySQL can really do that. I just need a 90% solution like the rest of my little Perl SQL parser. So I wrote my own SQL WHERE parser in about 100 lines. I’ll just reproduce the code comments here which explain, in general, how it works:


# This is not your traditional parser, but it works for simple to rather
# complex cases, with a few noted and intentional limitations.  First,
# the limitations:
#
#   * probably doesn't handle every possible operator (see $op)
#   * doesn't care about grouping with parentheses
#   * not "fully" tested because the possibilities are infinite
#
# It works in four steps; let's take this WHERE clause as an example:
#
#   i="x and y" or j in ("and", "or") and x is not null or a between 1 and 10 and sz="this 'and' foo"
#
# The first step splits the string on and|or, the only two keywords I'm
# aware of that join the separate predicates.  This step doesn't care if
# and|or is really between two predicates or in a string or something else.
# The second step is done while the first step is being done: check predicate
# "fragments" (from step 1) for operators; save which ones have and don't
# have at least one operator.  So the result of steps 1 and 2 is:
#
#   PREDICATE FRAGMENT                OPERATOR
#   ================================  ========
#   i="x                              Y
#   and y"                            N
#   or j in ("                        Y
#   and", "                           N
#   or")                              N
#   and x is not null                 Y
#   or a between 1                    Y
#   and 10                            N
#   and sz="this '                    Y
#   and' foo"                         N
#
# The third step runs through the list of pred frags backwards and joins
# the current frag to the preceding frag if it does not have an operator.
# The result is:
#
#   PREDICATE FRAGMENT                OPERATOR
#   ================================  ========
#   i="x and y"                       Y
#                                     N
#   or j in ("and", "or")             Y
#                                     N
#                                     N
#   and x is not null                 Y
#   or a between 1 and 10             Y
#                                     N
#   and sz="this 'and' foo"           Y
#                                     N
#
# The fourth step is similar but not shown: pred frags with unbalanced ' or "
# are joined to the preceding pred frag.  This fixes cases where a pred frag
# has multiple and|or in a string value; e.g. "foo and bar or dog".
#
# After the pred frags are complete, the parts of these predicates are parsed
# and returned in an arrayref of hashrefs like:
#
#   {
#     predicate => 'and',
#     column    => 'id',
#     operator  => '>=',
#     value     => '42',
#   }
#
# Invalid predicates, or valid ones that we can't parse, will cause
# the sub to die.

The full code is SQLParser.pm which is part of Maatkit.
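
Steps one through three from the comments can be sketched in a few lines. This is not the Maatkit code (that is Perl, in SQLParser.pm); it is a rough Python illustration. The operator list in OPERATOR is an assumption standing in for $op, and the function name is hypothetical. Step four (unbalanced quotes) and the final parsing into hashrefs are left out.

```python
import re

# Assumed operator list for illustration only; the real $op may differ.
OPERATOR = re.compile(
    r"<=|>=|<>|!=|=|<|>|\b(?:is\s+not|is|not\s+in|in|between|like)\b",
    re.IGNORECASE,
)
LEADING_KEYWORD = re.compile(r"^(?:and|or)\b", re.IGNORECASE)


def split_predicates(where):
    # Step 1: split on and|or wherever they appear, even inside strings.
    # The capture group makes re.split keep the keywords in the result.
    parts = re.split(r"\b(and|or)\b", where, flags=re.IGNORECASE)
    frags = [parts[0]]
    for i in range(1, len(parts), 2):
        frags.append(parts[i] + parts[i + 1])

    # Step 2: record which fragments contain an operator, ignoring the
    # and|or keyword that starts the fragment.
    has_op = [bool(OPERATOR.search(LEADING_KEYWORD.sub("", f))) for f in frags]

    # Step 3: walk backwards and join each operator-less fragment to the
    # fragment before it.
    for i in range(len(frags) - 1, 0, -1):
        if not has_op[i]:
            frags[i - 1] += frags[i]
            frags[i] = ""

    return [f.strip() for f in frags if f.strip()]


WHERE = (
    'i="x and y" or j in ("and", "or") and x is not null '
    "or a between 1 and 10 and sz=\"this 'and' foo\""
)
predicates = split_predicates(WHERE)
for p in predicates:
    print(p)
```

Run on the example WHERE clause, this produces the five predicate fragments shown in the second table.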

I’m not sure if I’m using the term “predicate” correctly so don’t quote me, but that’s a trivial concern next to whether the code works or not (it does; it’s tested).

Real world vs. cyber world

Posted on July 23rd, 2010 by Daniel Nichter

“Cyberscape” is not a term generally used any longer. In fact, it seems to me that no one really speaks of a division between the real world and the cyberworld any longer. Rather, people speak of how close and integrated they are and of technologies and ways to bring them even closer. I think a whole other domain of existence is quietly but rapidly merging with our “natural” existence and unless we are careful the result will not be favorable to humans.

My best friend and I, both old-school computer geeks (about 20 years of experience; not as old as some but old enough to have begun on dial-up BBSes rather than the Internet we know today), were at a local chophouse-brewery and we noticed a couple sitting across from one another but also infinitely far apart. Each of them was gazing at a smart phone, jacked in to the cyberworld whilst sitting in a real-world brewery. Then it struck me: all our electronic, connected devices (netbooks, iPads, smart phones, etc.) are windows through which we gaze longingly into the cyberworld.

[Image: Salvador Dalí, Person at the Window]

The Facebook phenomenon perplexes me and my friend. Neither of us has a Facebook account nor wants one. Yet both of us are on Facebook, against our will or wish. What is so alluring to hundreds of millions of people about social networking? I ask my friends and the most common, nearly universal answer is: “It’s a way to keep in touch with people.” That’s a nice sentiment but I argue that “keeping in touch” requires the ability to “touch” the person or something from them that may at least bear lingering signatures of them (e.g. their handwriting). Online social networking, or online anything, gives the impression of individuality, of discrete human beings behind the messages, but the fact is that nothing about the cyberworld uniquely or necessarily links to the real world. Therefore, you@Facebook is not really you; it could just as easily be me: simply give me your username and password. After a few minutes of studying your lexicon and idiomatic tendencies, I can begin masquerading as you in the cyberworld because fonts on screens appear the same when typed by me or anyone else. That is to say: there are no signatures of the human behind the messages. (Unless you’re using PGP or something, but that’s rare to never.) And the argument that handwriting and other real-world traits can be duplicated in a similar manner does not hold water because, sure, you may be able to duplicate my handwriting but you can never duplicate me (unless you have a full Hollywood staff of makeup artists and specialists and just so happen to be over six feet tall).

And so what? So what if the cyberworld connects but loosely and vaguely to the real world? So what if “keeping in touch” via Facebook is categorically different than keeping in touch by meeting a friend at a brewpub or writing a friend a hand-written letter? And so what if we gaze through windows at the cyberworld we’ve created and live semi-alternate lives there where whether I’m happy or not, single or not, “here” or not depends on my “status”? Aren’t two lives better than one? Isn’t it awesome that I can chat and message my friends all over the world, know what my ex is up to, and feel part of something greater when I join the Tupperware party? Perhaps there’s nothing wrong with all this… for the moment.

What I “fear” are the effects of the cyberworld on human aspects of living, like time. The real world has an “arrow” of time, or at least that’s our perception. During the summer, the sun takes about 15 hours to rise and set. A letter takes half an hour to an hour to write, then more time to fold, address, stamp and take to the post office, then days to arrive at its destination. A conversation between me and my friend over beers takes hours. Meeting someone and becoming their friend, a real friend, takes months. But these activities in the cyberworld either don’t happen (sunrise/set) or can happen in seconds or minutes. I think this has adverse effects both psychological and sociological.

Psychologically, we simply cannot handle or make good use of the information torrent that we incite or invite. I can receive a hundred emails or messages at once, but how or why would a normal person ever receive a hundred letters at once? If one morning you found a hundred letters stuffed in your mailbox you would probably be very surprised and surely overwhelmed if you had to respond to them all in a few days. But most of us regularly receive hundreds (if not thousands) of emails and texts each week and this does not strike anyone as strange, probably because we can and do respond to them all. What is strange, to me at least, is that mass electronic communication does not seem to cause the kind of anxiety that an equal or even fractional amount of physical communication would cause. I emphasize “seem” because I think that the anxiety is actually there but we’re too busy to notice it because the e-flood never ceases and only grows larger as we open more windows to the cyberworld.

“How do the drops of water know themselves to be an ocean?”

Sociologically, I fear that all this “keeping in touch” is actually diminishing real human contact. Why visit people when we already know what they’ve been doing, their relationship status, etc.? For example, I haven’t seen my other best friend in weeks because she’s been swamped with school and work, but we exchanged a few quick texts and scheduled a dinner and drink date tonight. We will meet face-to-face and I’ll catch her up on the mass of changes in my life recently. If I were on Facebook she would probably already know about these changes and our meeting would be a mere recapitulation. But since I’m not on Facebook, she’s very excited to hear about why my girlfriend and I are moving in together. And there it is: “excited to hear”. What the cyberworld and its always-on, tip-of-the-fingers cyber-universe of knowledge denies us is wonder, suspense and excitement. These human aspects and feelings do not translate between the worlds; they’re uniquely ours, real. Thus my argument is that the cyberworld dilutes the real world, denies us our yeast, so to speak, that gives rise to that unique pleasure of “catching up” with friends, sharing news, learning with shock and awe that after dating for only a month my girlfriend and I have begun to live together and we enjoy it very much.

Do I have a point in all this? Yes; it’s this: less cyber, more real. No matter how sick the latest iPhone may be, no matter how fantastic its apps, how fast or sexy the netbook, how pervasive the 3G coverage, how many people we can stay in “touch” with, we are real people in a real world and, unless you become a Borg, this will never change. Instead of up-playing the cyberworld and the gadgets we fabricate to peer into it, we should focus only on how such contrivances enhance our real-world lives. We should think about and look closely at how the cyber affects the real and ask ourselves: is the real benefiting? If not, then the cyber should be turned off, abandoned.

I’m not anti-technology; I’m pro-reality. Technology can help us “stay in touch”, but we must guard against mistaking the electronic sense of “touch” for the real sense of touch. Having more friends online than we ever actually see should be a red flag, not a badge of honor. Feeling that we never need to pen a letter is a loss. If you’ve not tried it recently, I encourage you to. You will find that the mind works very differently at the pace of pen and paper than it does at the pace of T9.

In the final analysis we should all ask ourselves: if we can have and exist in only one of the two worlds, the real or the cyber, which would we choose? I would choose the real for its sunrises and sunsets, its gourmet coffee, its summer rain, its caress of a loved one.

Single pass replace with Perl regex \G anchor

Posted on June 24th, 2010 by Daniel Nichter

Let’s say we have the query SELECT a, b, c FROM tbl WHERE id=1 ORDER BY a ASC, b ASC, c ASC and we want to remove the redundant ASC keywords, i.e. replace them with a blank string. This has to be fast and efficient which means a single, non-backtracking pass.

The first temptation might be to write:

1 while $query =~ s/(ORDER BY.+?)\s+ASC/$1/gmsi;

That works but it’s horribly inefficient because it does worse than backtracking: it restarts. After every match Perl restarts the pattern match from the beginning of the string, looking for ORDER BY. If the string is huge then these restarts absolutely kill performance–I’ll demonstrate this later.

It’s often the case, though, that there’s a point in the string after which all potential replaceable substrings will occur. This allows for a single, non-backtracking, non-restarting pass. The key is to anchor Perl at that point and tell it to search and replace forward until the end of the string. This is what the \G anchor is used for:

if ( $query =~ m/\bORDER BY /gi ) {
   1 while $query =~ s/\G(.+?)\s+ASC/$1/gmsi && pos $query;
}

The first condition finds the anchor point if it exists. If it doesn’t, then we’re done, but if it does then we begin an anchored search and replace. For the first iteration, the \G causes Perl to search from where the first condition anchored, i.e. just after ORDER BY. The /g (global match) modifier moves the anchor forward after each match so second and subsequent iterations only search forward. Finally, matching never backtracks and never restarts thanks to && pos $query because normally Perl would retry the whole string after hitting its end. use re "debug" shows this to be the case. Instead, the end of the string will cause the pattern match to fail and pos $query will become false, terminating the loop.
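
For comparison, here is the same idea outside Perl. Python’s re module has no \G anchor, so this is not a translation of the code above, only a sketch of the same strategy in a different form: find the anchor point once, then search and replace only in the text after it. The function name and the trailing \b after ASC (so that a longer word starting with ASC is left alone) are assumptions for illustration.

```python
import re

ORDER_BY = re.compile(r"\bORDER BY ", re.IGNORECASE)
ASC = re.compile(r"\s+ASC\b", re.IGNORECASE)


def strip_asc(query):
    """Remove redundant ASC keywords that appear after ORDER BY."""
    # Find the anchor point once; without it there is nothing to do.
    anchor = ORDER_BY.search(query)
    if anchor is None:
        return query
    # Everything before the anchor is set aside and never rescanned.
    head, tail = query[:anchor.end()], query[anchor.end():]
    # One forward pass over the tail only.
    return head + ASC.sub("", tail)


query = "SELECT a, b, c FROM tbl WHERE id=1 ORDER BY a ASC, b ASC, c ASC"
print(strip_asc(query))
```

The slicing copies the string, so this says nothing about the timings reported below; it only shows the structure of the single pass: find the anchor once, then never look back.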

The speed difference is enormous. I created a 216M string with a few hundred thousand occurrences of a keyword after an anchor near the end of the text (anchor pos at 70% total string length) and benchmarked how long it took each approach to remove the keywords after the anchor. The first approach took:

^CCommand terminated by signal 2
882.25user 269.79system 19:14.11elapsed 99%CPU

It took so long that I terminated it after 19 minutes. The second approach took:

16.04user 0.64system 0:16.75elapsed 99%CPU

The first approach obviously suffered from having to re-find the anchor point hundreds of thousands of times, each time discarding the first 70% of the string. Conversely, the second approach discarded the first 70% of the string once it found the anchor point and then did a single forward pass to the end of the string.

For small inputs the difference is probably negligible, but for hundreds of millions of small inputs the difference can add up, and that’s the kind of data set this particular code has to deal with quickly and efficiently.

Are you a hacker?

Posted on June 2nd, 2010 by Daniel Nichter

A stranger once asked me, “Are you a hacker?”, when he saw me coding on my Linux-running laptop. I didn’t know how to answer him, but after seeing only one guy in the OSBridge Hacker Lounge last night at 21:30 hours when I was finally going home I realized that, no, I’m not a hacker. Being in your teens and actually coding all night are requirements for being a “hacker” proper. The rest of us are just professionals who hack it until 22:00 hours at best then go home for a full night’s sleep. “Hacker” whatever at conferences amuses me because let’s face it: being young, owlish, and poor of diet and sleep may have gotten us to this point in our lives, but it does not and will not take us further.

Try again

Posted on May 25th, 2010 by Daniel Nichter

svn and Google Code sometimes bork on my commits, saying:

Transmitting file data …..svn: Commit failed (details follow):
svn: Commit failed unexpectedly — please try again later

The first line tells me that the transmission failed and that details follow. The second line tells me that the commit failed unexpectedly, which is essentially what the first line told me. Where are the details? Why did it fail? What was being attempted that failed? Why is the failure “unexpected”? Are there “expected” failures?

Here’s a nice error from the drizzle build process:

configure: error: Couldn’t find uuid/uuid.h. On Debian this can be found in uuid-dev. On Redhat this can be found in e2fsprogs-devel.

When a program fails I want details and I don’t need to be told “try again.” Of course I’m going to try again. In fact, I’m not going to stop trying until it succeeds.

Code documentation

Posted on May 19th, 2010 by Daniel Nichter

I’m guilty of not always documenting my code. Last week I finished up a big project, lots of new code, and by the end I wasn’t documenting the code because I thought, “the code is obvious, just read it.” And while the code may be obvious if one reads it, I realized yesterday that one probably doesn’t want to read it at first. I realized this when I found myself reading code docu/comments Baron put in some new modules for mk-index-usage. He wrote a bunch of code while I was working on my project so when it was my turn to work on mk-index-usage and I needed to bring myself up-to-speed I read Baron’s code docu, not his code. I trust that if Baron’s comment says “do x and y” then the code following does x and y and, until I need to tinker with x and y, I don’t care how x and y are actually done. My code doesn’t afford Baron that luxury because not everything is documented. So poor Baron had to actually read my code. Perhaps the real irony and question is: why would I take time to blog about this but not take time to write code documentation?

Devel::Size::total_size() fix/patch coderef segfault

Posted on April 21st, 2010 by Daniel Nichter

Devel::Size v0.71 and older causes Perl to segfault if you total_size() any non-trivial coderef. This was known as bug 29238 and bug 26781. I fixed/patched this. The patch (for 0.71; probably works on other versions; if not, just make the two simple changes manually) is simply:


48c48
<     if ((o->op_type = OP_TRANS)) {
---
>     if ((o->op_type == OP_TRANS)) {
309a310
>     break;

Program in the morning

Posted on April 9th, 2010 by Daniel Nichter

I’m a morning person. 8am is late to me. I’m also a late-evening person. I sometimes work from 8am to 10pm (with breaks for eating). I’ve learned time and time again that programming late-day, or whenever you’re tired, is truly a waste. I spent 7 hours late yesterday trying to perfect ReportFormatter, a rolled-my-own Perl module for generating columnized reports of insanely variable-width data. I was close, but whenever I’d fix X, Y would break, and vice-versa. I forced myself to quit, go home, eat, drink, sleep and in 3 hours this morning I redesigned the module and it works. The design and code are also nicer, unlike the hackish code I was desperately trying to munge into working order yesterday. All that wasted effort for:

Variable                  ...figs/mysqldhelp001.txt ...figs/mysqldhelp002.txt
========================= ========================= =========================
character_sets_dir        /home/daniel/mysql_bin... /usr/share/mysql/chars...
pid_file                  /tmp/12345/data/mysql_... /mnt/data/mysql/sl1.pid
ssl_key                                             /opt/mysql.pdns/.cert/...
report_host               127.0.0.1
log_bin                   mysql-bin                 sl1-bin
innodb_file_per_table     FALSE                     TRUE
datadir                   /tmp/12345/data/          /mnt/data/mysql/

That looks simple, but so do many things which are complex under the hood. It’s amazing how simple it is to do in one’s mind. Perhaps that’s why coding such a thing is difficult when one’s mind is fatigued, when the idea-code translations come out as logical as dreams.

First date with Python

Posted on March 26th, 2010 by Daniel Nichter

I began learning Python a few months back by reading Programming in Python 3. I didn’t know at the time that Python 3 was devel and that Python 2 was production. Regardless, I got the chance yesternight to sit down and use Python in the real world. My first languages were BASIC (pre-Microsoft), Pascal and C, primarily the latter. Then I learned C++, PHP, JavaScript and finally Perl some years ago. Now Python.

I think Perl is a perfect language; I absolutely love it. Sure it’s not the fastest in all cases, and its memory consumption is epic at times, but Perl is beautiful, expressive and tremendously helpful. It’s been said that Python has a beautiful heart and I may agree because I’m attracted to simplicity. In Perl there’s always a multitude of ways to do things and no one way is “correct”, but I’ve read that in Python there’s one way to do things and that’s the Python way. I’ve also read that “Python is not C” (when I couldn’t figure out why i++ wouldn’t work). Python may be a little obstinate but as André Gide said, “It is better to be hated for what you are than to be loved for what you are not.”

It’s no use doing a strict side-by-side comparison because there’s very little similarity between the languages other than they’re higher-level than C. Since I use languages to express business ideas and solve business problems, my concern is efficacy; i.e. does this language help or hinder me? Of course, one’s skill in the language is an important factor, but that aside (because my skill in Python is not even half my skill in Perl), I feel Python itself was helpful. What I found to be a hindrance was the Python documentation.

The docu is fantastically detailed, but that’s also the problem. One of Perl docu’s greatest features is the SYNOPSIS section of modules. A language’s syntax and built-in/basic data types are the trivial part of its learning curve. To do something useful with the language requires modules/classes. At first I don’t care about the details of these things, I just want to know the most basic usage; I want a synopsis. From that I get a feel for the module/class and can intuit how the other stuff is going to work.

Perhaps that’s only a Perl influence due to Perl modules having zero standard interface. To Python’s advantage its classes seem to have a much more consistent look and feel. With more time I’ll probably learn to read pydocs more efficiently. Also, since there’s one correct Python way to do things, I may even achieve Python proficiency more quickly than I did Perl proficiency, which is an eternal struggle when it’s permissible to be lazy or obscure.

It’s odd but that’s what I look forward to discovering with Python: will it permit me to be lazy? C is an awesome language but it’s also a relentless task-master. Perl is both awesome and lazy. Perhaps Python will be somewhere in between.

Copyright © 2009 codenode.