Jump To Content

Username:
Password:

Register
-38d 14h 4m 55s left

Vinnie

Posted by nuxi on 2009-Dec-31 at 02:06:13 in My Life (Login to reply)

So back in 2008 I took care Mary's dog Cleo for a bit. This time I have her other doggy.



Vinnie decided to stretch right as I took that photo. Here is his "Where is Cleo?" face that he's been giving me all day.



And finally, the "when are you going to sleep?" face:

View all 0 comments

Gaming the EC2 pricing system

Posted by nuxi on 2009-Dec-15 at 02:01:41 in Computers (Login to reply)

Amazon EC2 just rolled out a new instance type called a spot instance. So the short version is that each hour Amazon sets a spot price based on current supply and demand. Users of EC2 can have instances run as long as the price is beneath a spot limit that they set.

So I checked the recent spot pricing for an m1.small machine in the us-east zone and its between $0.025 and $0.035 per hour. This same machine as an on-demand instance (the original type) is $0.085. Now Amazon's goal is clearly to monetize their idle CPU time by offering a low cost option for data analysis work. But, if we make one little assumption we can use it to get a discount for on-demand instance.

So what do you think the chances are that Amazon has capped the spot price at the normal instance price? If they did, then you could just convert all your on demand instances to spot instances with the spot limit set to the normal price. That way your instance always stays up and you get a discount when the spot price is lower!

This only works for on demand instances though, a 3-year reserved instance is $0.041/hr and a 1-year reserved instance is
$0.056/hr for that same machine type. Depending on where the market goes, its probably a safer bet to keep those reservations.

Update: On a side note, I wonder if Amazon will actually bill me $0.052 for the two $0.026 instances I started for a moment while playing with it. I've always wanted to write a check to my credit card company for $0.06

View all 0 comments

I want focus follows mouse in OSX

Posted by nuxi on 2009-Dec-13 at 05:17:27 in Computers (Login to reply)

I really wish Apple would allow focus follows mouse in OSX outside of just Terminal.app and X11.app. Even Windows supports it, the setting is available in TweakUI.

Lots of people say I'm crazy when I talk about focus follows mouse and I eventually learn they think focus means which window is on top. So let me clarify:

Focus means which window gets keyboard input.
Focus does not mean which window is on top.

The confusion comes from the fact that OSX and Windows automatically put the window with focus on top, this is actually called autoraise. Autoraise and focus follows mouse are related features that should both be configurable. As you can see in the following screenshot, TweakUI allows autoraise to be enabled or disabled if focus (called activation) follows mouse is enabled.



Why won't Apple give us the ability to enable focus follows mouse and disable autoraise?

View all 0 comments

OSX hacking

Posted by nuxi on 2009-Dec-13 at 04:14:08 in Computers (Login to reply)

The VBox 3.1 changelog noted something about disabling the dock on OSX. So I did some googling, eventually found a shareware app that could also do it. I reverse engineered that and found these:

defaults write com.apple.dock orientation -string bottom
defaults write com.apple.dock pinning -string end
defaults write com.apple.dock autohide -bool true
defaults write com.apple.dock tilesize -float 1e-17


Why would I pay $14.95 for that?

For the curious the legitimate values for those keys are:

defaults write com.apple.dock orientation -string [left|bottom|right]
defaults write com.apple.dock pinning -string [start|middle|end]
defaults write com.apple.dock autohide -bool [true|false]
defaults write com.apple.dock tilesize -integer [1-256]


The OSX Preferences GUI only allows tilesizes from 16 to 128 and doesn't allow setting the pinning value.

View all 0 comments

Open source 3d acceleration

Posted by nuxi on 2009-Dec-13 at 04:09:00 in Computers (Login to reply)

I updated my desktop to the 2.6.32 kernel today and was greated with a pleasent suprise. I have 3d acceleration with the open source radeon X driver now! I stopped using the closed source fglrx driver a few months ago because ATI dropped support my laptop's graphics card. This kind of pissed me off because that card isn't even 3 years old yet.

I'm not sure if it was the new graphics layer in the 2.6.32 kernel or an update to Xorg that I hadn't restarted X for that did it. Regardless of how it happened, it is awesome. I tried enabling KMS support too, but I need the 1.7 release of Xorg for that I'm still on 1.6.

I've been getting my quake3 fix today...

View all 0 comments

IRC bot fun

Posted by nuxi on 2009-Dec-08 at 15:26:41 in Computers (Login to reply)

My friend linked mathematica into an IRC channel, so I started playing with commands like Run[], FilePrint[], etc., You know, the ones that let you run system commands:


14:27 <@mjgardes> if you get your ssh key in there, what will you do with it?
14:27 < NUXI> im not sure
14:28 <@mjgardes> it's behind NAT
14:28 < NUXI> math: "ssh-rsa *snip" >> /home/math/.ssh/authorized_keys

A little bit later (after he hardens it more) something occurs to me:

15:12 < NUXI> math: FilePrint[FromCharacterCode[{33}] <> "bash -c
'echo;/sbin/ifconfig eth0'"]
15:12 < math> eth0 Link encap:Ethernet HWaddr 00:16:d4:dd:e4:c2
15:12 < math> inet addr:192.168.0.37 Bcast:192.168.0.255 Mask:255.255.255.0
15:12 < math> inet6 addr: 2001:5c0:1103:1400:216:d4ff:fedd:e4c2/64
Scope:Global
15:12 < math> inet6 addr: fe80::216:d4ff:fedd:e4c2/64 Scope:Link
15:12 < math> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
15:12 < math> RX packets:1141505 errors:0 dropped:0 overruns:0 frame:0
15:12 <@mjgardes> crud

Isn't that a lovely IPv6 address?

nuxi@nobel:~(130)$ ssh math@2001:5c0:1103:1400:216:d4ff:fedd:e4c2
Last login: Tue Dec 8 13:30:55 EST 2009 from babylon.chroma on pts/8
math@merle ~ $


I win!

View all 0 comments

I <3 Indexes!

Posted by nuxi on 2009-Nov-30 at 21:04:58 in Computers (Login to reply)

My friend recently asked for some assistance optimizing some SQL. You might want to read the code on his page, I don't have a pretty syntax highlighter in my blog. I don't usually use MySQL because I prefer Postgres, but the offer a free beer prompted me to take a peek anyway.

His database schema was as follows:

CREATE TABLE IF NOT EXISTS `MyDatabase`.`TRK` (
`id` int( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`latitude` float NULL ,
`longitude` float NULL ,
`altitude` float NULL ,
`date` date NULL ,
`time` time NULL
) ENGINE = MYISAM DEFAULT CHARSET latin1 AUTO_INCREMENT=1 ;

CREATE UNIQUE INDEX lldt ON `MyDatabase`.`TRK` (`latitude`,`longitude`,`date`,`time`);


And the query that needed help is:
SELECT latitude, longitude, date, time, CONCAT(date, ' ', time) AS trk_dt_0,
'$dat1 $tim1' AS trk_dt_1 FROM TRK WHERE UNIX_TIMESTAMP(CONCAT(date, ' ', time)) >
UNIX_TIMESTAMP('$dat1 $tim1') ORDER BY UNIX_TIMESTAMP(trk_dt_0) ASC LIMIT 1;


Generally speaking the first step of optimization should be ensuring that the WHERE clause of your query is using an index. In this case, the WHERE clause is not even directly working on fields of the database. A short description of how this query runs would be:
1. Retrieve a row from the database
2. Combine the date and time fields
3. Compare them against the specified value
4. If it matches add to the result set
5. Repeat until end of database reached
6. Order the results
7. Select the first one

So the first step in optimizing this query is to just combine date and time fields in the database. The two halves can easily be seperated in the PHP application if necessary.

CREATE TABLE IF NOT EXISTS `MyDatabase`.`TRK` (
`id` int( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`latitude` float NULL ,
`longitude` float NULL ,
`altitude` float NULL ,
`datetime` datetime NULL ,
) ENGINE = MYISAM DEFAULT CHARSET latin1 AUTO_INCREMENT=1 ;

CREATE UNIQUE INDEX lldt ON `MyDatabase`.`TRK` (`latitude`,`longitude`,`datetime`);


Obviously this had to be done through a more complex conversion script for the live database. You would do that by adding the new column, running a script to copy the data from the seperate date and time fields into the combined field, and then delete the two old columns. So the new SQL query looks like this:

SELECT latitude, longitude, datetime, '$dat1 $tim1' AS trk_dt_1 FROM TRK WHERE
datetime > '$dat1 $tim1' ORDER BY datetime ASC LIMIT 1;


At the very least we have less typing for the programmer, but this alone apparently brought the runtime from over 11 hours to 45 minutes! Consider the steps now used:
1. Retrieve a row from the database
2. Compare the datetime against the specified value
3. If it matches add to the result set
4. Repeat until end of database reached
5. Order the results
6. Select the first one

The next step in optimization is to actually add our index to the datetime field.

CREATE UNIQUE INDEX dt ON `MyDatabase`.`TRK` (`datetime`);

Now the WHERE clause of the SQL query operates against the index instead of the full table. Searching an index is much faster than searching the database as a whole. A full row only needs to be retrieved if it is part of the result set.

So now the steps taken by the database engine are:
1. Retrieve a row from the index
2. Compare the datetime against the specified limit
3. If it matches add to full row from the database to the result set
4. Repeat until end of index reached
5. Order the results
6. Select the first one

Another benefit in this case is that our results are also sorted by the index and generally speaking the index itself is sorted...
1. Retrieve a row from the index
2. Compare the datetime against the specified value
3. If it matches, return the full row from the database and exit
4. Repeat until end of index reached

Now the first row that matches will cause the search to just exit. This optimization brought the overall runtime from 45 minutes to 3 seconds! Thats right, it apparently ran in 3 seconds.

My friend Peter suggested another optimization that he has told me about in the past. One SELECT query that returns 10 rows is much faster than 10 SELECT queries that return one row each. The application in this case is actually requesting every row of the database twice! First it requests every database row, and then for each one it requests the row immediately following it chronologically. With a little redesign the SQL query in the loop could be removed:

The redesigned PHP with only a single SQL query

Another minor optimization you can usually perform is defining fields as NOT NULL unless you actually need to store null data. I read a blog post once that claimed NULL was the biggest mistake in SQL design. Whether or not that is true is way above my head, but you can definitely squeak a tiny bit more performance out by not allowing NULL if you don't have to.

CREATE TABLE IF NOT EXISTS `MyDatabase`.`TRK` (
`id` int( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`latitude` float NOT NULL DEFAULT '0',
`longitude` float NOT NULL DEFAULT '0',
`altitude` float NOT NULL DEFAULT '0',
`datetime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
) ENGINE = MYISAM DEFAULT CHARSET latin1 AUTO_INCREMENT=1 ;



What I've mentioned above is nearly universally true, although creating a lot of different indexes will eventually slow down your database. What I mention below is more in the deep magic category, it may help or hurt your performance. You should always explore the benefits using things like the EXPLAIN command. EXPLAIN is not a standard SQL command, so here is the Postgres and MySQL documentation.

One trick on tables with lots of fields where we regularly need a very small subset of the data is to simply put all the fields in our index. My friend's database already had such an index, but the EXPLAIN command showed it was unused.

CREATE UNIQUE INDEX lldt ON `MyDatabase`.`TRK` (`latitude`,`longitude`, `datetime`);

By redefining this index with datetime first, MySQL will use it when running the search.

CREATE UNIQUE INDEX lldt ON `MyDatabase`.`TRK` (`datetime`, `latitude`,`longitude`);

So now the operation of the database is:
1. Retrieve a row from the index
2. Compare the datetime against the specified value
3. If it matches, return the row from the index and exit
4. Repeat until end of index reached

Combining the two indexes on my friend's database will help the performance of INSERT, UPDATE, and DELETE, but it may end up slowing down SELECT. Only by actually testing it would you be able to tell if it was worth it.

View all 0 comments

So... bad idea...

Posted by nuxi on 2009-Nov-17 at 02:21:01 in SNAFU (Login to reply)

Sure, my IP never changes... until it does.

Lets see bad ideas:
* Hosting the primary DNS for your domains on the end of a cable modem
* Hosting the primary MX server for your domains on the end of a cable modem
* Using a 1 day TTL for your domains on the end of a cable modem
* Using a 4 hour refresh for your domains on the end of a cable modem

Maybe this is yet another sign that I should be using real hosting.


Edit: apparently my backup MX isn't accepting mail for me...

View all 0 comments

Why I hate Exchange

Posted by nuxi on 2009-Nov-12 at 17:35:02 in Computers (Login to reply)

I have a bit of a problem with an MS Exchange server some of my mail is controlled by. The biggest issue is that Exchange is munging the mail contents.

Exchange supports two different mail protocols. (well, 3, but we are going to ignore X.400) The first one is what you might call the native Exchange storage format and protocol, MAPI. MAPI is also the protocol used by MS Outlook to talk to Exchange.

The second protocol is the standard SMTP format message called RFC822, or sometimes called MIME format in the Exchange documentation. This is the format used by SMTP, POP3, and IMAP. This is the format used by Exchange when it needs to talk to the real world.

Exchange 2003 stored messages in two different databases. One database stored mailed that arrived in MAPI format, the other stored mail that arrived in RFC822 format. Mail would be converted on the fly from one format to the other as needed.

Now this conversion from one format to the other causes some loss of information, such a thing is inevitable. So obviously, its important to only do this conversion when necessary right?

Exchange 2007 only stores mail in MAPI format. All incoming mail is converted to MAPI and it is converted back to RFC822 format if needed. So if you are an IMAP client to Exchange 2007, your mail is converted from RFC822 to MAPI and then back to RFC822.

Yeah, that double conversion works about as well as you might expect.

View all 0 comments

WTF bash!

Posted by nuxi on 2009-Nov-04 at 11:56:43 in Computers (Login to reply)

This is the classic asciibetical mode, its always case sensitive:

nuxi@nobel:~/wtf(0)$ echo $LC_COLLATE
C
nuxi@nobel:~/wtf(0)$ ls
Makefile makefile
nuxi@nobel:~/wtf(0)$ ls m*
makefile
nuxi@nobel:~/wtf(0)$ ls [m]*
makefile
nuxi@nobel:~/wtf(0)$ ls [mn]*
makefile
nuxi@nobel:~/wtf(0)$ ls [m-n]*
makefile


Now here is the en_US mode, its sometimes case sensitive and sometimes not:


nuxi@nobel:~/wtf(0)$ echo $LC_COLLATE
en_US.UTF-8
nuxi@nobel:~/wtf(0)$ ls
makefile Makefile
nuxi@nobel:~/wtf(0)$ ls m*
makefile
nuxi@nobel:~/wtf(0)$ ls [m]*
makefile
nuxi@nobel:~/wtf(0)$ ls [mn]*
makefile
nuxi@nobel:~/wtf(0)$ ls [m-n]*
makefile Makefile


And this is why I always set LC_COLLATE to C even when I run Unicode stuff.

View all 0 comments