Monday, 22 August 2011

How to debug Linux init scripts (in Ubuntu) that fail at boot time

A common problem: you have written an init script, or had to modify an existing one, and it works whenever you start it manually, but after a reboot the process is not running.

The reason is that the system is not in its normal "user mode" during boot time; init scripts run in a slimmed-down environment. For me the cause is in most cases that my script relies on environment variables, such as those I usually set in /etc/environment, which are only loaded after the init scripts have been executed.

Here is my troubleshooting checklist for init scripts that do not work at boot time (even though running them manually often works):

1) you have not set up any runlevels for your script (solution: update-rc.d)
2) you have not made the script executable (solution: chmod +x)
3) your script relies on environment variables which are not set at this stage of boot time
4) your script does not work when run manually either; this means it has general problems (script errors), so use the bashdb tool
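
Points 1 and 2 can be checked quickly before digging any deeper. The following is only a minimal sketch, assuming the usual Debian/Ubuntu SysV layout (/etc/init.d plus S??name / K??name symlinks in /etc/rc?.d). check_init_script is a hypothetical helper name I made up for illustration, not a standard tool.

```shell
# check_init_script is a hypothetical helper, not a standard tool.
# Usage: check_init_script NAME [INIT_DIR] [RC_ROOT]
check_init_script() {
    name=$1
    init_dir=${2:-/etc/init.d}
    rc_root=${3:-/etc}
    script="$init_dir/$name"
    status=0

    # Check 2: the script must be executable.
    if [ ! -x "$script" ]; then
        echo "not executable: $script (try: chmod +x $script)"
        status=1
    fi

    # Check 1: update-rc.d creates S??NAME / K??NAME symlinks in the rc?.d directories.
    found=0
    for link in "$rc_root"/rc?.d/[SK][0-9][0-9]"$name"; do
        if [ -L "$link" ]; then
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no runlevel links for $name (try: update-rc.d $name defaults)"
        status=1
    fi

    return "$status"
}
```

Called as "check_init_script daemon", it prints one line per problem found and returns a non-zero status if anything is missing. The second and third arguments exist only so the check can be pointed at a different directory tree.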

So what do you do if the script does not run at boot time but runs manually?

Answer: simulate a boot-time environment. You can do this with the env command [1]:

cd /
env -i LANG="$LANG" PATH="$PATH" TERM="$TERM" /etc/init.d/daemon start

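This closely approximates the situation at boot time: env -i starts the script with an empty environment, apart from the variables listed explicitly on the command line.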
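
To see what the cleaned environment actually hides, the env -i call can be wrapped in a small function and tried on a harmless command first. This is just a sketch: in_boot_env is a hypothetical name, /opt/example-jdk is a placeholder path, and the fallbacks for LANG and TERM are my own assumption for shells where they are unset.

```shell
# in_boot_env is a hypothetical wrapper around the env -i call shown above.
# It runs the given command from / with only LANG, PATH and TERM passed on.
in_boot_env() {
    (
        cd / &&
        env -i LANG="${LANG:-C}" PATH="$PATH" TERM="${TERM:-dumb}" "$@"
    )
}

# Placeholder value, exported the way a login session might have it.
export JAVA_HOME=/opt/example-jdk

# In the current shell the variable is visible ...
sh -c 'echo "current shell: ${JAVA_HOME:-unset}"'

# ... but inside the cleaned environment it is gone.
in_boot_env sh -c 'echo "clean env:     ${JAVA_HOME:-unset}"'
```

The same wrapper can then be used for the real thing, for example "in_boot_env /etc/init.d/daemon start".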

But modern scripts often do not output anything, even when you run them in a boot-time-like environment.
So what I usually do is use the bash debugger bashdb [2]. In Ubuntu it is easy to install because it is in the repository under "bashdb".

So now I run the whole thing like this:

cd /
env -i LANG="$LANG" PATH="$PATH" TERM="$TERM" bashdb /etc/init.d/daemon start

and it works like a charm for seeing what is going on.

Usually I can now track down the error, but in some cases it was not possible to get at the erroneous output. Often this is because the error lies in a line that calls the "start-stop-daemon" program, which many modern init scripts rely on. But a solution is near [3]. Just single-step in bashdb to the line that contains the start-stop-daemon call and print it out. For example, the resulting command is:

"start-stop-daemon -S -p/var/run/ -cjetty -d/path/to/solr -b -m -a /usr/bin/java -- -Dsolr.solr.home=/path/to/solr/solr -Djetty.logs=/path/to/solr/logs -Djetty.home=/path/to/solr -jar /path/to/solr/start.jar --daemon"

So in bashdb I step to that line, stop there and use the "x" command:

x start-stop-daemon -S -p"$JETTY_PID" $CH_USER -d"$JETTY_HOME" -b -m -a "$JAVA" -- "${RUN_ARGS[@]}" --daemon

and it prints the command with all variables expanded. Prefixed with sudo, and with -v added for verbose output, it looks like this:

sudo start-stop-daemon -S -p/var/run/ -cjetty -d/path/to/solr -v -b -m -a /usr/bin/java -- -Dsolr.solr.home=/path/to/solr/solr -Djetty.logs=/path/to/solr/logs -Djetty.home=/path/to/solr -jar /path/to/solr/start.jar --daemon

I usually then just execute this line in the shell, but often I still cannot see any output :-(. Following this blog post [3], I found that I have to drop the -b parameter (which sends the process to the background) in order to see the output:

sudo start-stop-daemon -S -p/var/run/ -cjetty -d/path/to/solr -v -m -a /usr/bin/java -- -Dsolr.solr.home=/path/to/solr/solr -Djetty.logs=/path/to/solr/logs -Djetty.home=/path/to/solr -jar /path/to/solr/start.jar --daemon

In my case JAVA_HOME was set in /etc/environment but was not visible to the script, so the script did not know where JAVA_HOME was. I put the JAVA_HOME path at the top of the init script and everything worked fine after that.
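
For illustration, here is one way such a block at the top of an init script could look. Instead of hard-coding the path, this sketch reads the value from /etc/environment when JAVA_HOME is not already set. It assumes the file contains plain NAME=value or NAME="value" lines, and read_env_value is a hypothetical helper name, not part of any standard tool.

```shell
# read_env_value is a hypothetical helper: it prints the value of NAME
# from a file of NAME=value lines (by default /etc/environment).
read_env_value() {
    name=$1
    file=${2:-/etc/environment}
    [ -r "$file" ] || return 1
    # Use the last matching assignment and strip optional surrounding double quotes.
    sed -n "s/^$name=//p" "$file" | tail -n 1 | sed 's/^"\(.*\)"$/\1/'
}

# Near the top of the init script, before JAVA_HOME is first used:
if [ -z "${JAVA_HOME:-}" ] && value=$(read_env_value JAVA_HOME) && [ -n "$value" ]; then
    JAVA_HOME=$value
    export JAVA_HOME
fi
```

The simpler variant, which is what I actually did, is a single line such as JAVA_HOME=/path/to/jdk at the top of the script; the lookup only saves maintaining the path in two places.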


1 comment:

Anonymous said…

Thanks for this; bashdb helped me debug a non-starting init.d script