Oracle is buying Sun. Today, on a Sunday, I wanted to “read” up on one of those cool jewels that will come with the package, called DTrace. I haven’t worked on Sun for at least 6 years, so I thought, heck, let’s start with a demo. So I Googled for “dtrace” and it came up with a “DTrace Review” on Google Video from Bryan Cantrill. It discusses DTrace and how it can be used to significantly improve debugging, both in development and on live systems.
OK, cool stuff.
I listened through the whole hour and it got to me, not least because I am currently in the last stages of finishing up some performance profiling for a customer and I realized, being “in the act”, that profiling itself has a major impact on how to interpret the data. Profiling normally comes with some overhead (CPU/IO), so I realized that I had to compensate for this effect in my timing data.
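To make that compensation idea a bit more concrete, here is a minimal sketch in Python. It is not what I used for the customer; the function names and the simple “measured time minus number of probes times cost per probe” model are assumptions made purely for illustration.

```python
import time


def estimate_probe_overhead(probe, samples=10_000):
    """Average wall-clock cost of one call to `probe`, in seconds.

    `probe` is a hypothetical stand-in for whatever instrumentation
    call the profiler adds around the code being measured.
    """
    start = time.perf_counter()
    for _ in range(samples):
        probe()
    return (time.perf_counter() - start) / samples


def compensate(measured_seconds, probe_calls, overhead_per_probe):
    """Subtract the estimated instrumentation cost from a measurement.

    Assumes the overhead is roughly constant per probe; the result is
    clamped at zero because an estimate can overshoot.
    """
    return max(0.0, measured_seconds - probe_calls * overhead_per_probe)


if __name__ == "__main__":
    # Here the "probe" is simply reading the clock, the cheapest
    # instrumentation one can think of.
    overhead = estimate_probe_overhead(time.perf_counter)
    print(f"estimated cost per probe: {overhead:.9f} s")
    print(f"2.0 s measured with 1000 probes -> {compensate(2.0, 1000, overhead):.6f} s")
```

Real profiling overhead is rarely this linear (I/O and caching effects get in the way), so treat the corrected number as a better estimate, not as the truth.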
A while ago a colleague asked me if I would help him interpret an AWR report and whether I would be able to say anything about the performance of the database based only on this report. After a quick look at the AWR report, I replied within 2 or 3 minutes: “Yep, I know what the performance is…”. Over a period of 2 days (in between jobs) I guided him along the same thought patterns I had followed while interpreting the AWR report. Without going into much detail, what gave it away was the fact that AWR’s own SQL statements were all in the number 1 positions of the top ten “bad SQL” statement lists. I understand that this conclusion is based on a “leap of faith”. ADDM, ASH, AWR, etc. should be as unintrusive to the database system as possible and therefore should not pop up in the top ten lists on a standard, badly performing system…
The point I want to make here is that I really was amazed by the power of “DTrace”. So I also wanted to see what this guy “Bryan Cantrill” had to say on his blog. What caught my eye was “Catching disk latency in the act”. In this post, a guy called Brendan Gregg, a Sun Fishworks (Analytics) engineer, can be seen live in action demonstrating what happens to your latency “if you shout at your disks”…
Apparently (as said on Bryan’s blog) “Brendan actually made this discovery while exploring drive latency that he had seen in a lab machine due to a missing screw on a drive bracket. (!)” Another lesson learnt. Fasten your drive brackets properly. 😉
I had a lot of laughs. I think I re-ran it 3 times or so, and then it also hit me: if anything defines the term Oracle “ACE”, Brendan demonstrates it here as well.
How did I make that jump?
I have had a lot of discussions with Anjo Kolk lately (and a year or two ago, with Doug) about what defines an Oracle ACE (/ACED). There were also once heated discussions about the topic on Howard Rogers’ blog site at the time Oracle actually defined the criteria, but as far as I know, Howard’s blog post about the subject, and especially the comments from the community, were lost the moment Howard closed down that site.
Anyway, seeing Brendan Gregg (doing his real-time demo) made me realize again what, for me at least, defines an (Oracle) ACE (nominated, awarded or not). The qualities needed:
- Passion (with a capital “P”)
- Fun in what he is doing, trying to achieve
- Being Knowledgeable (about hardware and software)
- Sharing the info, his knowledge
- Out of the Box thinking ( – Eh, Against the Box Screaming – )
If people like Bryan Cantrill and Brendan Gregg are what Sun (also) brought to the table in closing the deal, then yeah, “Thank You!” – Oracle/Sun.
😎
When a performance problem hits, make sure nobody’s shouting at the disks… 😉
I was thinking more along the lines that Kevin (Closson) and Brendan could now sing a duet in the Sun Exadata datacenter
🙂