Friday, May 18th, 2012
You might think Definition of Done (DoD) is a brilliant idea from the Agile world… but the dirty little secret is that it’s just a holdover from the waterfall era.
While the DoD thought-process is helpful, it can lead to certain unwanted behavior in your team. For example:
- DoD usually ends up being a measure of output; it rarely focuses on outcome.
- In some teams, I’ve seen it disrupt true collaboration and instead encourage more of a contractual and “cover my @ss” mentality.
- DoD creates a false sense (an illusion) of doneness. Unless you have real data showing users actually using and benefiting from the feature/story, how can we say it’s done?
- I’ve also seen teams gold-plating stuff in the name of DoD. DoD encourages an all-or-nothing approach. Teams are forced to build fully sophisticated features/stories, even when we are not sure whether those features/stories are really required.
- It gets harder to practice an iterative and incremental approach to developing features. DoD does not encourage experimenting with different sophistication levels of a feature.
I would much rather the team members truly collaborate on an ongoing basis, build features in an iterative and incremental fashion, and strongly focus on Simplicity (maximizing the amount of work NOT done). IME Continuous Deployment is a great practice to drive some of this behavior.
Posted in Agile, Organizational | 3 Comments »
Tuesday, November 1st, 2011
“Release Early, Release Often” is a proven mantra, but what happens when you push this practice to its limits, i.e., deploying the latest code changes to the production servers every time a developer checks in code?
At Industrial Logic, developers are deploying code dozens of times a day, rapidly responding to their customers and reducing their “code inventory”.
This talk will demonstrate the approach, deployment architecture, tools and culture needed for CD, and how we gradually got there at Industrial Logic.
Process/Mechanics
This will be a 60-minute interactive talk with a demo. It also has a small group activity as an icebreaker.
Key takeaway: When we started about 2 years ago, achieving CD felt like a huge step, almost all or nothing. Over the next 6 months we were able to break down the problem and achieve CD in baby steps. I think the approach we took to CD is a key takeaway from this session.
Talk Outline
- Context Setting: Need for Continuous Integration (3 mins)
- Next steps to CI (2 mins)
- Intro to Continuous Deployment (5 mins)
- Demo of CD at Freeset (for Content Delivery on Web) (10 mins) – a quick, live walk thru of how the deployment and servers are set up
- Benefits of CD (5 mins)
- Demo of CD for Industrial Logic’s eLearning (15 mins) – a detailed walk thru of our evolution and live demo of the steps that take place during our CD process
- Zero Downtime deployment (10 mins)
- CD’s Impact on Team Culture (5 mins)
- Q&A (5 mins)
Target Audience
- CTO
- Architect
- Tech Lead
- Developers
- Operations
Context
Industrial Logic’s eLearning context: number of changes, developers, customers, etc.
Industrial Logic’s eLearning has rich, interactive multimedia content delivered over the web. Our eLearning modules (called Albums) have pictures and text, videos, quizzes, programming exercises (labs) in 5 different programming languages, a packaging system to validate and produce the labs, plugins for different IDEs on different platforms to record programming sessions, an analysis engine to score students’ lab work in different languages, a commenting system, a reporting system to generate different kinds of student reports, etc.
We have 2 kinds of changes: eLearning platform changes (which require updating code or configuration) and content changes (either code or other multimedia changes). These are managed by 5 distributed contributors.
On average we’ve seen about 12 check-ins per day.
Our customers are developers, managers and L&D teams from companies like Google, GE Energy, HP, EMC, Philips, and many other Fortune 100 companies. Our customers have very high expectations of us. We have to practice what we preach.
Learning outcomes
- General Architectural considerations for CD
- Tools and Cultural change required to embrace CD
- How to achieve Zero-downtime deploys (including databases)
- How to slice work (stories) such that something is deployable and usable very early on
- How to build different visibility levels such that new/experimental features are only visible to a subset of users
- What Delivery tests do
- You should walk away with some good ideas of how your company can practice CD
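The idea of visibility levels mentioned in the learning outcomes can be illustrated with a minimal sketch. This is not Industrial Logic’s implementation; the level names, the `User` class and the `FeatureRegistry` class are all hypothetical, and the sketch assumes each user belongs to exactly one audience group.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Visibility(IntEnum):
    """Hypothetical visibility levels, from most restricted to fully public."""
    DEVELOPERS = 1
    BETA_USERS = 2
    EVERYONE = 3


@dataclass
class User:
    name: str
    # Assumption: each user belongs to exactly one audience group.
    group: Visibility = Visibility.EVERYONE


@dataclass
class FeatureRegistry:
    levels: dict = field(default_factory=dict)

    def set_level(self, feature: str, level: Visibility) -> None:
        """Widen or narrow the audience of a feature without redeploying code."""
        self.levels[feature] = level

    def is_visible(self, feature: str, user: User) -> bool:
        level = self.levels.get(feature)
        if level is None:
            return False  # unknown features stay hidden by default
        # A developer (1) sees features released to beta users (2) or everyone (3);
        # a regular user (3) only sees features released to everyone.
        return user.group <= level
```

With something like this in place, an unfinished feature can be deployed to production while staying visible only to developers, and later widened to beta users and then to everyone by changing its level.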
Posted in Agile, Continuous Deployment, Deployment, Lean Startup, Product Development, Testing, Tools | No Comments »
Sunday, March 6th, 2011
Recently TV tweeted saying:
Is “measure twice, cut once” an #agile value? Why shouldn’t it be – it is more fundamental than agile.
To which I responded saying:
“measure twice, cut once” makes sense when cost of a mistake & rework is huge. In software that’s not the case if done in small, safe steps. A feedback centric method like #agile can help reduce the cost of rework. Helping you #FailFast and create opportunities for #SafeFailExperiements. (Extremely important for innovation.)
To step back a little, the proverb “measure twice and cut once” in carpentry literally means:
“One should double-check one’s measurements for accuracy before cutting a piece of wood; otherwise it may be necessary to cut again, wasting time and material.”
Speaking more figuratively it means “Plan and prepare in a careful, thorough manner before taking action.”
Unfortunately many software teams literally take this advice as
“Let’s spend a few solid months carefully planning, estimating and designing software upfront, so we can avoid rework and last minute surprise.”
However, after doing all that, they realize it was not worth it. Best case, they delivered something useful to end users with about 40% rework. Worst case, they never delivered, or delivered something buggy that does not meet users’ needs. And what about the opportunity cost?
Why does this happen?
Humphrey’s law says: “Users will not know exactly what they want until they see it (maybe not even then).”
So how can we plan (measure twice) when it’s not clear what exactly our users want (even if we can pretend that we understand our users’ needs)?
How can we plan for uncertainty?
IMHO you can’t plan for uncertainty. You respond to uncertainty by inspecting and adapting. You learn by deliberately conducting many safe-fail experiments.
What is Safe-Fail Experimentation?
Safe-fail experimentation is a learning and problem-solving technique which emphasizes conducting many simultaneous, small, controlled experiments with small variations. Since these are small controlled experiments, failure is an expected and acceptable outcome.
In the software world, spiking, low-fi-prototypes, set-based design, continuous deployment, A/B Testing, etc. are all forms of safe-fail experiments.
Generally we like to start with something really small (but end-to-end) and rapidly build on it using user feedback and personal experience. Embracing Simplicity (“maximizing the amount of work not done”) is critical as well. You frequently cut small pieces, integrate the whole and see if it’s aligned with users’ needs. If not, the cost of rework is very small. Embrace small #SafeFail experiments to really innovate.
Or as Kerry says:
“Perhaps the fundamental point is that in software development the best way of measuring is to cut.”
I also strongly recommend reading the Basic principles of safe-fail experimentation.
Posted in Agile, Continuous Deployment, Planning | No Comments »
Sunday, January 30th, 2011
How good are you at limiting red time, i.e., applying the concept of limiting WIP (Work-In-Progress) to Programming and Product Development?
What is Red Time?
- During Test Driven Development and Refactoring, time taken to fix compilation errors and/or failing tests.
- While Programming, time taken to get the logic right for a sub-set of the problem.
- While Deploying, downtime experienced by users
- While Integrating, time spent fixing broken builds
- While Planning and Designing, time spent before the user can use the first mini-version of the product
- And so on…
Basically time spent outside the safe, manageable state.
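To make the idea concrete for the TDD case, here is a minimal sketch that totals red time from a log of state changes. The event format (a sorted list of timestamp and `"red"`/`"green"` pairs) and the function name `red_time` are assumptions made for illustration; a real tool would derive these events from test runs or compiler output.

```python
from datetime import datetime, timedelta


def red_time(events, end):
    """Sum the time spent in the "red" state.

    events: list of (timestamp, state) tuples sorted by time, where state is
            "red" or "green" (a hypothetical log format).
    end:    timestamp at which the session ended.
    """
    total = timedelta()
    # Pair each event with the start of the next one; the last event lasts until `end`.
    following = events[1:] + [(end, None)]
    for (start, state), (next_start, _) in zip(events, following):
        if state == "red":
            total += next_start - start
    return total
```

Dividing the result by the session length gives the fraction of the session spent outside the safe state, which is one simple number to track over time.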
Be it planning, programming or deploying, a growing group of practitioners has learned how to effectively reduce red time.
For example:
- Refactoring Strategies, which help you reduce your red time by keeping you in a state where you can take really safe steps to ensure the tests are always running.
- Zero-Downtime Deployment, which helps you deploy new versions of the product without your customers experiencing any downtime.
- Continuous Deployment, which helps you get a change made to code straight to your customers as efficiently as possible.
- Lean Start-up techniques, which help validate business hypotheses in a safe, rapid and lean manner.
- And so on…
I highly recommend watching Joshua Kerievsky’s video on Limited Red Society to gain his insights.
Over the years we’ve realized that it always helps to have simple tools to visualize your red time. Visualization helps you understand what’s happening better. And that helps in proactively finding ways to minimize red time.
At Industrial Logic we have a new product called Sessions which helps you visualize your programming session. It highlights your red time.
Posted in Agile, Continuous Deployment, Lean Startup, post modern agile, Product Development, Programming | No Comments »
Tuesday, November 24th, 2009
Let’s assume you have a simple web application which runs on a web server like Tomcat, Jetty, IIS or Mongrel and is backed by a database. Also, let’s say you have only one instance of your application running (non-clustered) in production.
Now you want to deploy your application several times a week. The single biggest issue that gets in the way of continuous deployment is downtime: every time you deploy a new version of your application, you don’t want to interrupt your users or destroy their sessions. In this blog, I’ll describe how to deploy your applications without interrupting the user.
First time set-up steps:
- On your local machine, set up a web server cluster for session replication and ensure your application works fine in a clustered environment. (Tips on setting up a Tomcat cluster for session replication). You might want to look at all the objects you are storing in your session and check whether they are serializable.
- On your production server, set up another web server instance. We’ll call this temp_webserver. Make sure the temp_webserver runs on a different port than your production server. (In Tomcat, update the ports in the tomcat/conf/server.xml file). For now, don’t enable clustering.
- In your browser, access the temp_webserver (different port) and make sure everything is working as expected. Usually the ports on which the production web server and the temp_webserver are running should be blocked and not directly accessible from any other machine. In such cases, set up an SSH tunnel on the specified port to access the webapp in your browser. (ssh -L 3333:your.domain.com:web_server_port username@server_ip_or_name). Alternatively you could SSH to the production box and use Lynx (text browser) to test your webapp.
- Now enable clustering on both web servers, start them and make sure the session is replicated. To test session replication, bring up one webserver instance, log in, then bring up the other instance, now bring down the first instance and make sure your app does not prompt you to log in again. Wait a sec! When you brought down the first server, you got a 404 Page not found. Of course: even though clustering might be working fine, your browser has no way to know about the other web server instance, which is running on a different port. It expects a webserver on the production server’s port.
- To solve this problem, we’ll have to set up a reverse-proxy server like Nginx on your production box or on any of your other publicly accessible servers. You will have to configure the reverse proxy server to run on the port on which your web server was running and change your webserver to run on a different (more secure) port. The reverse proxy server will listen on the required port and proxy all web requests to your server. (sample Nginx Configuration). This will help us start and stop one of our webservers without the user noticing it. Also note that it’s a good practice to let your reverse proxy server serve all static content. It’s usually an order of magnitude faster.
- After setting up a round robin reverse proxy, you should be able to test your application in a clustered environment.
- Once you know your webapp works fine in a clustered env in production, you can change the reverse-proxy configuration to direct all traffic to just your actual production webserver. You can comment out the temp_webserver line to ensure only production webserver is getting all requests. (Every time you make a change to your reverse proxy setting, you’ll have to reload the configuration or restart the reverse proxy server. Which usually takes a fraction of a second.)
- Now un-deploy the application on the temp_webserver and stop the temp_webserver. Everything should continue working as before.
- Note: At each step of this process, it’s handy to run a battery of functional tests (Selenium or Sahi) to make sure that your application actually works the way you expect. Manual testing is neither sustainable nor scalable.
This concludes our initial set-up. We have enabled ourselves to do continuous deployment without interrupting the user.
Note: Even though our web-server is clustered for session replication, we are still using the same database on both instances.
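Commenting and uncommenting the webserver lines in the reverse proxy configuration is easy to script. The sketch below is one possible way to do it, assuming an Nginx `upstream` block in which each backend is a `server host:port;` line; the function name `route_only_to` and the ports used in the usage example are made up for illustration.

```python
import re

# Matches an (optionally commented-out) upstream line such as
# "    server 127.0.0.1:8080;" or "    # server 127.0.0.1:8081;".
SERVER_LINE = re.compile(r"^(\s*)#?\s*(server\s+[\w.]+:(\d+);.*)$")


def route_only_to(config: str, active_port: int) -> str:
    """Return the config with every backend commented out except the one on active_port."""
    rewritten = []
    for line in config.splitlines():
        match = SERVER_LINE.match(line)
        if match:
            indent, directive, port = match.groups()
            prefix = "" if int(port) == active_port else "# "
            line = f"{indent}{prefix}{directive}"
        rewritten.append(line)
    return "\n".join(rewritten)
```

For example, if the production webserver listens on port 8080 and the temp_webserver on 8081 (hypothetical values), `route_only_to(config, 8081)` sends all traffic to the temp_webserver. After writing the result back to the configuration file, the proxy still has to be told to reload it, for instance with `nginx -s reload`.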
Now let’s see what steps we need to take when we want to deploy a new version of our application.
- FTP the latest web app archive (war) to the production server.
- If you have made any Database changes follow Owen’s advice on Zero-Downtime Database Deployment. This will help you upgrade the DB without affecting the existing, running production app.
- Next, bring up the temp_webserver and deploy the latest web application. In most cases, it’s just a matter of dropping the web archive in the web apps folder.
- Set up an SSH tunnel from your machine to access the temp_webserver. Run all your smoke tests to make sure the new version of the web app works fine.
- Go back into your reverse proxy configuration, comment out the production webserver line and uncomment the temp_webserver line. Reload/restart your reverse proxy; all requests should now be routed to the temp_webserver. Since your reverse proxy does not hold any state, reloading/restarting it should not make any difference. Also, since your sessions are replicated in the cluster, users should see no difference, except that now they are working on the latest version of your web app.
- Now undeploy the old version and deploy the latest version of your web app on the production webserver. Bring it up and test it using an SSH tunnel from your local machine.
- Once you know the production webserver is up and running on the latest version of your app, comment out the temp_webserver and uncomment the production webserver in the reverse proxy setting. Reload the configuration or restart the reverse proxy. Now all traffic should get routed to your production webserver.
- At this point the temp_webserver has done its job. It’s time to undeploy the application and stop the temp_webserver.
Congrats, you have just upgraded your web application to the latest version without interrupting your users.
Note: All the above steps are trivial to automate using a script. Because of its speed and accuracy, I would bet all my money on the automated script.
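As a rough sketch of what such a script could look like, the function below encodes only the ordering of the steps above. The `ops` object and all of its method names are hypothetical placeholders; each one would wrap whatever your environment needs (file transfer, database migration, proxy reload and so on).

```python
def zero_downtime_deploy(ops):
    """Run the switch-over sequence in order.

    `ops` is any object providing the hypothetical methods used below.
    Live traffic is only moved after the target server passes its smoke tests.
    """
    ops.upload_archive()          # copy the new web archive to the server
    ops.migrate_database()        # backward-compatible DB changes only
    ops.start_temp_and_deploy()   # new version on the temp_webserver

    if not ops.smoke_test("temp"):
        ops.stop_temp()           # users never saw the new version
        raise RuntimeError("new version failed smoke tests on temp_webserver")

    ops.route_traffic_to("temp")  # production webserver is now idle
    ops.redeploy_production()     # undeploy old version, deploy new one

    if not ops.smoke_test("production"):
        # Traffic stays on the temp_webserver, so users are still served.
        raise RuntimeError("new version failed smoke tests on production webserver")

    ops.route_traffic_to("production")
    ops.stop_temp()
```

Keeping the sequence separate from the individual operations makes it possible to exercise the ordering with a fake `ops` object before pointing the script at a real server.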
Posted in Continuous Deployment, Deployment, Tips | 1 Comment »
Sunday, October 18th, 2009
It appears to me that the Agile Community is falling behind the innovation curve. At conferences, user groups, mailing lists, etc., we see the same old same old stuff (maybe I’m missing something). So where is the real innovation happening? What space should I be watching?
These were the questions I posed to the group @ the SDTConf 2009. Later, during our discussion at the conference, we tried answering them. After a wonderful discussion we came up with some suggestions:
- Web 2.0
- High Scalability, Performance and Operations space
- NoSQL
- Continuous Deployment and Monitoring space – Owen’s Slides are a good starting point
- Watch out for conferences like O’Reilly’s Velocity
- Alternative Language (non-mainstream languages) space. Lots of interesting experiments are going on in the:
  - Dynamic language space
  - Functional language space
  - Hybrid language space
  - Domain Specific Language space
- Cloud Computing, Parallel Computing (Grid Computing), Virtualization space
- Code Harvesting Space – Check out Test Driven Code Search and Code Genie as a starting point
- Complex Adaptive Systems and its implication on our social interactions space. Dave Snowden’s work is a good starting point
- eLearning and visual assessments (feedback) of a programming session. Check out Visualizing Proficiency
- Polyglot Programming space
- With Google Apps, people are able to build 100s of Apps each month and get instant feedback on their ideas
- Social Networking and Second Life space
- Conference: Lots of interesting experiments are being conducted in the conference space. Conferences have evolved into something very different from before.
- Distributed Development and Remote Pairing space
If you would like to contribute to this list, please add your point on the SDTConf Wiki.
Posted in Agile, Community, Conference, Tips | 4 Comments »
Wednesday, June 21st, 2006
What is the purpose of Continuous Integration (CI)?
To avoid last-minute integration surprises. CI tries to break the integration process into small, frequent steps to avoid big-bang integration, which leads to integration nightmares.
If people are afraid to check-in frequently, your Continuous Integration process is not working.
CI process goes hand in hand with Collective Code Ownership and Single-Team attitude.
CI is the manifestation of “Stop the Line” culture from Lean Manufacturing.
What are the advantages of Continuous Integration?
- Helps to improve the quality of the software and reduce the risk by giving quicker feedback.
- Experience shows that a huge number of bugs are introduced during the last-minute code integration under panic conditions.
- Brings the team together. Helps to build collaborative teams.
- Gives a level of confidence to check in code more frequently that was once not there.
- Helps to maintain the latest version of the code base in an always shippable state (for testing, demo, or release purposes).
- Encourages loose coupling and evolutionary design.
- Increases visibility and acts as an information radiator for the team.
- By integrating frequently, it helps us avoid a huge integration effort at the end.
- Helps you visualize various trends about your source code. Can be a great starting point to improve your development process.
Is Continuous Integration the same as Continuous build?
No, continuous build only checks if the code compiles and links correctly. Continuous Integration goes beyond just compiling.
- It executes a battery of unit and functional tests to verify that the latest version of the source code is still functional.
- It runs a collection of source code analysis tools to give you feedback about the Quality of the source code.
- It executes your packaging script to make sure the application can be packaged and installed.
Of course, both CI and CB should:
- track changes,
- archive and visualize build results and
- intelligently publish/notify the results to the team.
How do you differentiate between Frequent Versus Continuous Integration?
Continuous means:
- As soon as there is something new to build, it’s built automatically. You want to fail-fast and get this feedback as rapidly as possible.
- When it stops becoming an event (ceremony) and becomes a behavior (habit).
Merge a little at a time to avoid the big cost of a full integration at the end of a project. The bottom line is fail-fast and quicker feedback.
Can Continuous Integration be manual?
Manual Continuous Integration is the practice of frequently integrating with other team members’ code manually, on a developer’s machine or an independent machine.
Because people are not good at being consistent or at doing repetitive tasks (that’s a machine’s job), IMHO this process should be automated, so that you are reliably compiling, testing, inspecting and responding to feedback.
What are the Pre-Requisites for Continuous Integration?
This is a grey area. Here is a quick list:
- Common source code repository
- Source Control Management tool
- Automated Build scripts
- Automated tests
- Feedback mechanism
- Commit code frequently
- Change of developer mentality, i.e., a desire to get rapid feedback and increase visibility.
What are the various steps in the Continuous Integration build?
- pulling the source from the SCM
- generating source (if you are using code generation)
- compiling source
- executing unit tests
- run static code analysis tools – project size, coding convention violation checker, dependency analysis, cyclomatic complexity, etc.
- generate version control usage trends
- generate documentation
- setup the environment (pre build)
- set up third party dependency. Example: run database migration scripts
- packaging
- deployment
- run various regression tests: smoke, integration, functional and performance test
- run dynamic code analysis tools – code coverage, dead-code analyzer, etc.
- create and test installer
- restore the environment (post build)
- publishing build artifact
- report/publish status of the build
- update historical record of the build
- build metrics – timings
- gather auditing information (i.e. why, who)
- labeling the repository
- trigger dependent builds
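The fail-fast character of such a build can be sketched in a few lines. This is an illustration rather than a replacement for a CI server: the function name `run_build` is made up, and each step is assumed to be a zero-argument callable that returns True on success.

```python
import time


def run_build(steps):
    """Run (name, action) pairs in order and stop at the first failure.

    Each action is a zero-argument callable returning True on success
    (an assumption of this sketch). Returns a list of
    (name, passed, seconds) tuples for the steps that actually ran.
    """
    results = []
    for name, action in steps:
        started = time.perf_counter()
        try:
            passed = bool(action())
        except Exception:
            passed = False  # a crashing step counts as a failed step
        results.append((name, passed, time.perf_counter() - started))
        if not passed:
            break  # fail fast: later steps would only delay the feedback
    return results
```

The returned timings double as simple build metrics, and the last entry tells you which step to report as the cause of a broken build.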
Who are the stakeholders of the Continuous Integration build?
- Developers
- Testers [QA]
- Analysts/Subject Matter Experts
- Managers
- System Operations
- Architects
- DBAs
- UX Team
- Agile/CI Coach
What is the scope of QA?
They help the team with automating the functional tests. They pick up the product from the nightly build and do other types of testing.
For example: Exploratory testing, Mutation testing, and some System tests which are hard to automate.
What are the different types of builds that make Continuous Integration and what are they based on?
We break down the CI build into different builds depending on their scope & time of feedback cycle and the target audience.
1. Local Developer build:
1.a. Job: Retains the environment. Only compiles and tests locally changed code (incremental).
1.b. Feedback: less than 5 mins.
1.c. Stakeholders: Developer pair who runs the build.
1.d. Frequency: Before checking in code.
1.e. Where: On developer workstation/laptop.
2. Smoke build:
2.a. Job: Compiles, Unit tests, Automated acceptance and Smoke tests on a clean environment [including database].
2.b. Feedback: less than 10 to 15 mins. (If it takes longer, you could make the build incremental instead of starting on a clean environment.)
2.c. Stakeholders: All the developers within a single team.
2.d. Frequency: With every check-in.
2.e. Where: On a team’s dedicated continuous integration server. [Multiple modules can share the server, if they have parallel builds]
3. Functional build:
3.a. Job: Compiles, Unit tests, Automated acceptance and all Functional/Regression tests on a clean environment. Stubs/Mocks out other modules or systems.
3.b. Feedback: less than 1 hour.
3.c. Stakeholders: Developers, QA, Analysts in a given team.
3.d. Frequency: Every 2 to 3 hours.
3.e. Where: On a team’s dedicated continuous integration server.
4. Cross module build:
4.a. Job: If your project has multiple teams, each working on a separate module, this build integrates those modules and runs the functional build across all those modules.
4.b. Feedback: less than 4 hrs.
4.c. Stakeholders: Developers, QA, Architects, Managers, Analysts across the module teams.
4.d. Frequency: 2 to 3 times a day.
4.e. Where: On a continuous integration server owned by all the modules. [Different from above]
5. Product build:
5.a. Job: Integrates all the code that is required to create a single product. Nothing is mocked or stubbed [except things that are not yet built]. Creates all the artifacts and publishes a deployable product.
5.b. Feedback: less than 10 hrs.
5.c. Stakeholders: Everyone, including Project Management.
5.d. Frequency: Every night.
5.e. Where: On a continuous integration server owned by all the modules. [Same as above]
General Rule of Thumb: No silver bullet. Adapt your own process/practice.
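One way to keep these feedback budgets honest is to write them down as data and compare them with measured build times. The sketch below uses the upper bounds from the list above; the `BuildTier` class, the `TIERS` list and the `over_budget` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildTier:
    name: str
    frequency: str
    max_feedback_minutes: int  # upper bound on feedback time, from the list above


TIERS = [
    BuildTier("local developer", "before checking in code", 5),
    BuildTier("smoke", "with every check-in", 15),
    BuildTier("functional", "every 2 to 3 hours", 60),
    BuildTier("cross module", "2 to 3 times a day", 4 * 60),
    BuildTier("product", "every night", 10 * 60),
]


def over_budget(durations_minutes):
    """Return the names of tiers whose measured duration exceeds the budget.

    durations_minutes maps a tier name to its last measured duration in
    minutes; tiers without a measurement are ignored.
    """
    return [
        tier.name
        for tier in TIERS
        if durations_minutes.get(tier.name, 0) > tier.max_feedback_minutes
    ]
```

A build that shows up in `over_budget` is a prompt to adapt: make it incremental, move slower tests to a later tier, or split the build.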
Posted in Agile, Continuous Deployment, Learning, Metrics, Organizational | No Comments »