Who needs DBA Skills? [#TSQL2sDay]

      14 Comments on Who needs DBA Skills? [#TSQL2sDay]
T-SQL Tuesday for November 2010

T-SQL Tuesday for November 2010

Databases are doing just fine. If an issue is encountered add more hardware to the mix. If that does not work then add more servers and move databases to it. No DBAs on site, just developers and a business owner that wants to save money.

What? With that first paragraph I can pinpoint at least 3 main reasons why DBA skills are required:

  1. Adding more hardware is just a tactical approach, never strategical as the new one will encounter the same issue the old one did in a matter of months.
  2. Adding more servers can be complex as databases will reside on different boxes. Messing around with linked servers and executing joins across boxes are not the best approach.
  3. Licensing costs go through the roof! Yes. I have dealt with a solution where the Lead Developer decided to add 8 Servers running SQL Server 2005 Enterprise Edition each. The tab was running over $300K just in licensing costs.

Ok, so what else will a good DBA perform to add value to any company? These come to mind:

  1. To plan a long term solution properly. Architect a database structure in order to be flexible to extensions, following 3rd and 4th normalization forms, with proper relationships, keys and indexes.
  2. To understand the business requirements and design a database layout that will approach them. Which tables will be transactional? Which ones will be historical? Which ones will receive bulk insert operations?
  3. To implement proper security.
  4. If data will grow too large, to prepare the underlying tables to support sliding windows to swap out old data.
  5. To make the business owners aware that a separate entity would be beneficial if reporting is crucial. Nothing better than replicating data to another instance and allowing it be queried all day long.
  6. To make developers understand that business logic shall not run in the database tier. Data validation, cleansing, and any process that will require scalar UDFs are bad. If there is no way to add an app tier, then CLRs can help, but these kind of solutions need to have proper justifications.
  7. To see if filtered and/or include indexes can benefit a solution.
  8. To justify if data compression can be a good trade off with CPU cycles.
  9. To suggest when mirroring, log shipping, replication and/or clustering will benefit.
  10. To understand if I/O will require additional spindles and how to prove it.
  11. To understand if an edition downgrade, release upgrade, virtualization or X64 implementation will add value to any solution.
  12. To have an up to date checklist for production database implementations
  13. To follow a good process: Unit Test, QA, UAT, Production
  14. To learn from the best and to share his knowledge with their team.

DBA skills are not just a nice-to-have. They are a requirement. Do it right the first time and avoid the hassle of fixing in the future.

Once I was approached by a developer and he literally told me:

Every night you should have a job that truncates the logs. Also, you need to shrink the databases to make them faster.

I am not making that up! It is a true story. Imagine if that developer was taking care of the databases. I believe that most developers have no business in the DB world.

  • Pingback: Tweets that mention Who needs DBA Skills? [#TSQL2sDay] | Oscar Zamora -- Topsy.com()

  • Rahulchudiwal

    Hi Oscar for the benefit of developer can you please explain what is wrong in what the developer said about truncating log and shrinking db

    • The point is that the developer was thinking that shrinking databases and truncating logs was going to make the database faster and minimize locks that let the application to time out (SQL 2000 box). And based on the wrong thinking he was telling the DBA what needed to be done in order to fix the problem.

      Now to address your first question. Truncating a log will effectively render useless any subsequent log backups. So a recovery would have been possible until the last good log backup before the truncation, or if a full backup was taken after the truncation (assuming that the instance never crashed).

      Shrinking a database will effectively reduce footprint at the expense of defragmented datafiles. A table scan will become very costly as the heads will need to go back and forth for each chunk of blocks that need to be read instead of just a clean sequential read, hence increasing I/O stalls dramatically.

      Either steps should never be approached.

  • 6. To make developers understand that business logic shall not run in the database tier. Data validation, cleansing, and any process that will require scalar UDFs are bad. If there is no way to add an app tier, then CLRs can help, but these kind of solutions need to have proper justifications.

    Why? That seems to be an awfully broad statement. Would it not be better to discuss the problem with the developers?

    Data is, in most cases, business critical. The database (be it Oracle or MSSQL) rarely changes, but the apps which access that data do change…frequently (relatively speaking of course). Why wouldn’t business logic be closer to the data? If you change the app, then it’s a relatively simple port from platform a to platform b.

    And if you think platform a is going to be the only application accessing your data…well, you know the answer to that. Why repeat, multiple times possibly, what you can do one time in the database?


    • I hate to agree with Chet in public, but I’ll make an exception this time. At some point, your users will decide to bypass the application and go straight to the database. It’s at least worth a discussion or a ‘it depends’

      • I’m totally screenshotting this to show off to all 4 of my friends.

      • Users have no access to production databases. They got access to replicated environments and analytical databases to run queries all day long. They can perform partial matches and multi join queries. We monitor and if we find out that a process is very inefficient, then we will recommend a better way.

        To the point of business logic on the db side, there are certain processes that can benefit from running at the db engine level as explained to @oraclenerd. It is worth a discussion with key resources.

    • The requirement is discussed between the team lead and the developers. If the application needs to access data by any form then a dba or db architect needs to be involved.

      The database rarely changes (I assumed you meant structure), yes, but the business logic can and will change. I am not talking about CRUD or search operations, but processes that run before they hit the DB. For example, if there is a requirement that a search box with autocomplete needs to be developed, the developer can tend to code a process that queries the database everytime the user types in a character. That can work well with 10 or 50 users, but with 1000 you effectively bring the database to its knees. The application tier needs to query and cache the data and then serve the autocomplete request based on the cache. Need to scale? No problem, just add more app servers.

      There are certain operations that can benefit from the db engine, like bulk data comparison for example. We got several anti-semi joins that compare millions of records to satisfy certain business requirement. It would be impractical to move that piece to an app tier.

      • The scenario you describe there isn’t real business logic, it’s a GUI “tool” of sorts. It makes the user experience better. That process would be something for the NoSQL folks out there, get it once, cache it, move along. You’re right, perfect for the middle tier.

        I define business logic essentially as a transaction, not a single INSERT/UPDATE/DELETE mind you, but a grouping of those to accomplish a specific task. The closer that kind of thing is to the data, the better.

        There is typically more than one platform that access your data(base), put your logic there and you only have to write it once. Put it in the middle, and you get GIGO.

        • By your definition, if a transaction performs several crud operations atomically then it is a good candidate to live inside the database. If the operation requires services that connects to other peers in order to request data (e.g. avs code) then it shall live in a middle tier.

          I always think about minimizing locking and not wasting cpu cycles that the rdbms engine can take advantage of. Not all business logic needs to live in a mid tier, that is something that the architects need to work on.

          My point on the post was due to the majority of developers putting all the code within a sproc. Typical ltrim, rtrim and then substring joins and %like% partial matches to accomplish certain CRUD operation. All the cleansing needs to happen outside the db and then passed in clean for exact match join, etc.

          • If you depend on the application developers to cleanse the data…you are in for a world of hurt. I hope that you are taking a great advantage of constraints and putting as many as humanly possible on your tables.

            I agree though that app devs could reduce the amount of functions used in joins as you mention, but that would be caught during basic code reviews. My database, don’t abuse it. 🙂

          • No, they do cleanse input strings (like search values for example) through the application. ETL layer cleansing, that is another silo and got a team for that.

            All new database development is checked and the good thing is that our developers understand that the db is ours and can’t abuse it.

          • Btw, thanks for the comments.

  • Pingback: T-SQL Tuesday #012 : Summary of why DBA skills are necessary | Paul S. Randal()