I found a bug in E17, how can I provide the E17 developers with some useful information?

This is a guide to providing the Enlightenment developers with useful information that they can use to track down a bug you have found.

Recompile EFL with Debug Symbols

The key to getting good debug information from a backtrace, Valgrind or any other collection method is to compile the Enlightenment Foundation Libraries with debugging symbols. For debugging Enlightenment the following pieces need to be recompiled:
Before recompiling any part of EFL, make sure you add -g to your compile flags. This can be done by running the following command:

    export CFLAGS=-g

This adds -g to the compile flags so that debugging symbols are included. Now for each of the libraries and e do:

    make clean distclean

(or whatever variation on this you have - but remember to make clean distclean before re-running configure)

Crashing and Debugging

Now everything should have debugging symbols. We can now simply run Enlightenment. There are two ways you can do this: the automatic way or the manual way. The automatic way is to use the xnest.sh script you will find in the source directory for E - e17/apps/e/xnest.sh. More about xnest.sh at the end; first we will describe the manual method.

Try to perform the steps needed to reproduce the crash. You will get the "white box of death" that says E segfaulted. We can now go over to a text console (ctrl+alt+f1) and log in. Now you need to attach gdb to e. Find out the process ID of enlightenment as shown below.

    ps auwx | grep enlightenment

Now type:

    gdb enlightenment PID

where PID is the process ID you found. Gdb will load and stream along for a bit, then give you a prompt. You can now debug. First try gdb's backtrace command:

    (gdb) bt

This is the stack trace. It basically means the main() function called ecore_main_loop_begin(), which called _ecore_main_loop_iterate_internal(), which called _ecore_main_select(), and that in turn called select() etc. The important bit here is that E has its own segfault handler - it traps its own problems and tries to let you recover (that's what the white box of death is). Let's take a look at the function that was called:

    #6 0x0808f706 in e_sigseg_act (x=11, info=0x80a9fb0, data=0x80aa030)

The e_sigseg_act() function is called when the program segfaults (it is called directly by the kernel, interrupting whatever e was doing just before it was called - the thing it was doing would have caused the segfault).
So that means in this example E segfaulted inside the select() function (frame 7 is an intermediate frame that calls the signal handler).

Next we need to get some more info about this crash. We will now go to the stack frame just before the segfault. In this case it's stack frame 8. You want a listing of the code there and some info (so we can double-check that the code you have there matches what we have here). The gdb commands you then want are:
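
One way to collect this non-interactively is sketched below. It is only an illustration, assuming a gdb that accepts `-batch`, `-x` and `-p`; the function name `write_gdb_cmds` and the file names are made up for this example, and frame 8 is simply the frame from the example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of E17): print a gdb command list
# that inspects one stack frame.
# $1 = frame number (8 in the example above).
write_gdb_cmds() {
    frame="$1"
    printf 'bt\n'                 # full backtrace first
    printf 'frame %s\n' "$frame"  # select the frame just before the segfault
    printf 'list\n'               # source listing around that frame
    printf 'info locals\n'        # local variables in that frame
}

# Usage (not run here; the paths are arbitrary choices):
#   write_gdb_cmds 8 > /tmp/e17-frame.gdb
#   gdb -batch -x /tmp/e17-frame.gdb -p "$(pgrep -x enlightenment | head -n 1)" \
#       > /tmp/e17-frame.txt 2>&1
```

The resulting text file holds the backtrace, the listing and the local variables in one place, which makes it easy to attach to a bug report. `pgrep -x` is used in the usage line because a plain `ps | grep` also matches the grep process itself.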
If you want to get adventurous you should start dumping variable values for us. In this example I can't debug select because it's in libc and it is probably not the reason for the crash. We will look to the frame above that, frame 9, to see if any nasty data was being sent to select.

    (gdb) fr 9

We can see some variables there and function calls - often variables like pointers may be garbage or NULL, thus causing a segv. We can see what they are using the print (p) command, as in the example below:

    (gdb) p ret

If the variable is a pointer, printing it prints the pointer value, not what it points to, and what it points to is important. To print that we suggest:

    p *pointer

Example:

    (gdb) fr 5

We know it's a pointer (Display *); the * means it's a pointer to a Display struct/type. The pointer value looks healthy: it is not 0x0 or a very low number, so we can try to look at the data it is pointing to:

    (gdb) p *dd

Never mind, that's Xlib's display struct. It's private and we don't know what's inside - BUT the types e uses internally (such as Evas_List) will generally allow you to do this.

In general it's a good idea to spend some quality time with gdb and do all this - mail all the output of gdb from one of these "debugging sessions" and then we can sift through it. It may not mean a lot to you, but it means the world to us.

Sometimes the stack is screwed and, well, there is nothing you can do. Often this means you need to resort to valgrind to catch things before the stack gets screwed. This gets a bit more intense, BUT you will need to run E under valgrind, allowing gdb to attach.

Finding memory problems with Valgrind

To debug using valgrind, enlightenment must be run through valgrind. This can be done by executing valgrind in a console as shown below.

    export E_START="enlightenment_start";valgrind --tool=memcheck --db-attach=yes enlightenment

You will need an xserver running for it to display on.
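
If attaching gdb interactively is not practical, Valgrind's report can also be written to a file and mailed along. The sketch below is only an illustration of that idea: the function name `valgrind_cmd` and the log path are hypothetical, while `--log-file` and `--num-callers` are standard Valgrind options. It passes the display explicitly and prints the command line rather than running it, so you can check it first.

```shell
#!/bin/sh
# Hypothetical helper (not part of E17): print a valgrind command line
# for a given X display.
# $1 = display (for example :1), $2 = log file path (an arbitrary choice).
valgrind_cmd() {
    display="$1"
    logfile="$2"
    # E_START is set as in the command above, here only for the child process.
    printf 'DISPLAY=%s E_START=enlightenment_start ' "$display"
    printf 'valgrind --tool=memcheck --num-callers=20 '
    printf '%s=%s enlightenment\n' '--log-file' "$logfile"
}

# Usage (not run here):
#   eval "$(valgrind_cmd :1 /tmp/e17-valgrind.log)"
```

On Valgrind 3.11 and later, where `--db-attach` was removed, the usual replacement is to start Valgrind with `--vgdb-error=0` and connect from gdb with `target remote | vgdb`.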
The console will need to be usable even if the wm is screwed (so another machine sshing in, a text console etc.). Remember, Valgrind is intercepting all memory operations so it will make things very slow. But it is thorough and can find a lot of difficult-to-find problems.

When you get a problem valgrind will spew and then ask if you want to attach gdb. Often you get a harmless one of these once when you start e - about reading uninitialized memory inside XPutImage. Ignore this; it's harmless. It will be this:

    ==7072== Syscall param writev(vector[...]) points to uninitialised byte(s)

To ignore the error just say no (n). You may even get it twice if you are running multihead. Anything else though is a likely candidate for a problem; when it complains say yes (y) to attach and get us the valgrind AND gdb info (debug in gdb as above).

Valgrind may complain a lot when enlightenment shuts down about problems inside exit(); these can also be ignored. They look like this:

    ==7072==

If you see this it is valgrind's own internal debugging hooks causing problems.

You may need to run valgrind from a console - many people ask how they can do this and debug a wm. Well, here is one way; note that you will need root access.

    sudo X -ac :1 &

This will run an empty xserver on :1 and flip to it. You can flip back to your console with ctrl+alt+f1 or wherever the console was. You can flip back to the new xserver with something like ctrl+alt+f8. The other way is to use xnest.sh; this way does not require root access or the use of virtual terminals (see below).

Enlightenment will be running (very slowly) under valgrind. Do whatever it is you do to make the bug happen. When e "locks up" and doesn't seem to move (but the mouse does), flip back to the text console where you ran valgrind from and see if it is complaining (as per above).

Using xnest.sh

e17/apps/e/xnest.sh is a script that automates the use of gdb and other debugging tools.
It uses Xnest, so you can run it from within an already running enlightenment; you get a second enlightenment running in a window. It is best to run it from a terminal program, so that you can see the output and do not have to switch to virtual terminals. You can run it from anywhere; in this document we run it from the source directory. It won't be in the path, though. xnest.sh can show you the list of options if you use:

    ./xnest.sh --help

By default xnest.sh will run enlightenment under the supervision of gdb, so you don't need to find the PID and do the gdb attach thing after it crashes. The default is to automate things as much as possible, so you don't have to tell gdb to run enlightenment, or tell it to display the backtrace after a crash. Just run xnest.sh and do whatever you have to do to crash Enlightenment; xnest.sh then cleans up everything, leaving you with a backtrace you can cut and paste and send to the developers.

There are also options to run a non-automated gdb session, where you have to tell gdb to start running Enlightenment using the run command, and you have to do your own backtraces, but you get the chance to use more gdb commands for displaying frames and variables. You can do this with raw gdb, or using a couple of wrappers around gdb. You can also use xnest.sh to run Enlightenment under other debugging tools like valgrind, strace, and memprof.

It's all quite straightforward and easy to use so long as you have the debugging tools and Xnest installed. On the other hand, it's not a sophisticated script and we are dealing with the debugging of crashing things here. Sometimes it can leave things in a state that it can't recover from. The following commands are helpful if you get into that state:

    killall -KILL xnest.sh
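
For readers curious about what the automated path roughly involves, here is a minimal sketch of running Enlightenment under gdb inside a nested X server by hand. It is not the contents of xnest.sh, only the general idea: the function names `free_display` and `run_nested`, the window size and the two-second wait are all assumptions made for this example, and it relies on the common Linux convention that running X servers leave `/tmp/.X<n>-lock` and `/tmp/.X11-unix/X<n>` behind.

```shell
#!/bin/sh
# Sketch only: this is NOT what xnest.sh contains, just the general idea.

# Print the first X display (starting at :1) that has no lock file or socket.
free_display() {
    n=1
    while [ -e "/tmp/.X${n}-lock" ] || [ -e "/tmp/.X11-unix/X${n}" ]; do
        n=$((n + 1))
    done
    printf ':%s\n' "$n"
}

# Start Xnest on a free display, run enlightenment under gdb inside it,
# print a backtrace when it stops, then shut the nested server down.
run_nested() {
    disp="$(free_display)"
    Xnest "$disp" -ac -geometry 800x600 &
    xnest_pid=$!
    sleep 2   # crude wait for the nested server to come up
    DISPLAY="$disp" gdb -batch -ex run -ex bt --args enlightenment
    kill "$xnest_pid"
}

# Usage (not run here): call run_nested from a terminal inside your X session.
```

Running gdb with `-batch -ex run -ex bt` starts the program immediately and prints a backtrace as soon as it stops, which is the same hands-off behaviour described for the default xnest.sh mode.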

