
Friday, 26 January 2018

Fixing “Device or resource busy” Errors on a Linux VPS

Quick Start

If you just want enough information to fix your problem quickly, you can read the How-To section of this post and skip the rest. I would highly recommend reading everything though, as a good understanding of the concepts and commands outlined here will serve you well in the future. We also have Video and Audio included with this post that may be a good quick reference for you. Don’t forget that the man and info pages of your Linux/Unix system can be an invaluable resource as well when you’re trying to solve problems.

Preface

To make things easier on you, all of the black command line and script areas are set up so that you can copy the text from them. This does make using the commands easier, but if you’re not already familiar with the concepts presented here, typing the commands yourself and working through why you’re typing them will help you learn more. If you hit problems along the way, take a look at the Troubleshooting section near the end of this post for help.
There are formatting conventions that are used throughout this post that you should be aware of. The following is a list outlining the color and font formats used.
Command Name or Directory Path
Warning or Error
Command Line Snippet With Commands/Options/Arguments
Command Options and Their Arguments Only
Hyperlink

Overview

When you try to access an object on a Linux file system that is in use, you may get an error telling you that the device or resource you want is busy. When this happens, you may see a message like the one in Listing 1.
Listing 1
$ sudo umount /media/4278-62C2/
umount: /media/4278-62C2: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
Notice that two commands are named at the end of the output: lsof and fuser. These are the two commands that this post focuses on.

Introducing lsof

lsof is used to LiSt Open Files, hence the command’s name. It’s a handy tool normally used to list the open files on a system along with the associated processes or users, and can also be used to gather information on your system’s network connections. When run without options, lsof lists all open files along with all of the active processes that have them open. To get a full and accurate view of what files are open by what processes, make sure that you run the lsof command with root privileges.
To use lsof on a specific file, you have to specify the full path to the file. Remember that everything in Linux is a file, so you can use lsof on anything from directories to devices. This makes lsof a very powerful tool once you’ve learned it.
There are many options for lsof, and I have listed summaries for the ones that I find most useful in Listing 2. Anything with square brackets around it (“[” and “]“) is an argument to the option, and a pipe (“|“) means that you can choose one of two alternatives ([4|6] means choose 4 or 6).
Listing 2
+d [directory]
    Scans the specified directory and all directories/files in its top level to see if any are open.
+D [directory]
    Scans the specified directory and all directories/files in it recursively to see if any are open.
-F [characters]
    Selects which fields are printed, one per line and each prefixed with its identifying character, so that the output is easier for other programs to process. Type lsof -F ? for a list of characters.
-i [address]
    Shows network connections and the processes associated with them (only your own processes unless you run lsof with root privileges). Connection types can be specified via an argument: [4|6][protocol][@hostname|hostaddr][:service|port]
-N
    Enables the scanning/listing of files on NFS mounts.
-r [seconds]
    Causes lsof to repeat its scan indefinitely, pausing the given number of seconds between scans.
+r [seconds]
    A variation of the -r option that will exit on the first iteration when no open files are listed. It uses seconds as a delay value.
-t
    Strips all data out of the output except the PIDs. This is good for scripts and piping data around.
-u [user|UID]
    Allows you to show the open files for the user or user ID that you specify.
-w
    Causes warning messages to be suppressed. Make sure that the warnings are harmless before you suppress them.
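As a rough illustration of the -F option from Listing 2, the sketch below turns field output into one tab-separated line per open file. The function name parse_lsof_fields is made up for this example and is not part of lsof. It assumes that the p (PID), c (command) and n (name) fields were requested, and it simply ignores any other field lines that lsof prints.

```shell
#!/bin/bash -
# Sketch: convert "lsof -F pcn" field output into PID<TAB>COMMAND<TAB>NAME lines.
# parse_lsof_fields is a hypothetical helper name, not part of lsof.
parse_lsof_fields() {
    awk '
        /^p/ { pid = substr($0, 2) }     # a "p" line starts a new process
        /^c/ { cmd = substr($0, 2) }     # command name of that process
        /^n/ { printf "%s\t%s\t%s\n", pid, cmd, substr($0, 2) }   # one line per file
    '
}

# Intended use (assumes lsof is installed):
#   sudo lsof -w -F pcn /media/cdrom0 | parse_lsof_fields
```

Fed the hand-written lines p2238, cbash, fcwd and n/media/cdrom0/boot, the function should print a single line holding 2238, bash and /media/cdrom0/boot. Treat it as a starting point rather than a complete parser for every field lsof can emit.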
If you are extra security conscious, have a look at the SECURITY section of the lsof man page. There are 3 main issues that the developers of lsof feel may be security caveats. Many distributions have addressed at least some of these security concerns already, but it doesn’t hurt to understand them yourself.

Introducing fuser

By default fuser just gives you the PIDs of processes that have a file open on a system. The PIDs are accompanied by a single character that represents the type of access that the process is performing on that file (f=open file, m=memory mapped file or shared library, c=current directory, etc). If you want output that’s somewhat similar to the lsof command, you can add the -v option for verbose output. According to the man page, this formats the output in a “ps-like” style. To get a full and accurate view of what files are open by all processes, make sure that you run fuser with root privileges. Listing 3 holds some of the fuser options that I find most useful.
Listing 3
-i
    Used with the -k option, it prompts the user before killing each process.
-k
    Attempts to kill all processes that are accessing the specified file.
-m
    Shows the processes accessing any file within a mounted file system.
-s
    Silent mode where no output is shown. This is useful if you only want to check the exit code of fuser in a script to see if it was successful.
-u
    Appends the user name associated with each process to each PID in the output.
-v
    Gives a "ps-like" output format that is somewhat similar to the default lsof output.
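To make the access-type letters easier to read, here is a small sketch that spells out one token of fuser output, such as 2238c or 2238c(jwright). The helper name describe_fuser_token is invented for this example. The letters c, f and m are the ones named above; e, F and r come from my reading of the psmisc fuser man page, so check your own man page before relying on them.

```shell
#!/bin/bash -
# Sketch: explain one token of fuser output such as "2238c" or "2238c(jwright)".
# describe_fuser_token is a hypothetical helper name, not part of fuser.
describe_fuser_token() (
    token=$1
    pid=${token%%[!0-9]*}        # leading digits are the PID
    flags=${token#"$pid"}        # whatever follows the PID
    flags=${flags%%\(*}          # drop a trailing "(user)" added by -u
    out=""
    while [ -n "$flags" ]; do
        ch=${flags%"${flags#?}"} # first remaining character
        flags=${flags#?}
        case $ch in
            c) desc="current directory" ;;
            e) desc="executable being run" ;;
            f) desc="open file" ;;
            F) desc="open file for writing" ;;
            r) desc="root directory" ;;
            m) desc="memory-mapped file or shared library" ;;
            *) desc="unrecognized access type ($ch)" ;;
        esac
        out="${out:+$out, }$desc"
    done
    printf '%s: %s\n' "$pid" "${out:-no access type shown}"
)

# Example call:
#   describe_fuser_token '2238c(jwright)'
```

As far as I know, fuser writes only the PIDs to standard output and sends the file name and access letters to standard error, so a script that wants the full tokens would need to capture both streams.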
fuser is supposed to be a little lighter weight than lsof when it comes to using your system resources. To get an idea of what “a little” meant, I ran some very quick tests on both of the commands. I found that fuser consistently took only 30% – 50% of the time that it took lsof to run the same scan, but used about the same amount of RAM (within 5%). My tests were quick and dirty using the ps and time commands, so your mileage may vary. In any event very few users, if any, will notice a performance difference between the two commands because they use such a small amount of system resources.

How-To

Hopefully by the point you’re reading this section you either have, or are beginning to get a pretty good understanding of both the lsof and fuser commands. Either one of them can be used to solve device and/or resource busy errors in Linux. Let’s take a look at a few scenarios.
Say that I have mounted a CD to /media/cdrom0, used it for a while copying files from it, and now want to unmount it. The problem is that Linux won’t let me unmount the CD. I get the familiar error in Listing 4, but you can see that I then use lsof and fuser to track down what’s going on.
Listing 4
$ sudo umount /media/cdrom0
umount: /media/cdrom0: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
$ sudo lsof -w /media/cdrom0
COMMAND  PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
bash    2238 jwright  cwd    DIR   11,0     4096 1600 /media/cdrom0/boot
$ sudo fuser -mu /media/cdrom0
/media/cdrom0:        2238c(jwright)
Both commands tell me what PID is accessing the file system mounted on /media/cdrom0 (2238). Each of the two commands also tells me that the process is using a directory within the /media/cdrom0 file system as its current working directory. This is shown as the cwd specifier in the lsof output, and the letter c in the output of fuser (appended to the PID). Finally, each of the commands tells me that a process I (jwright) started is using the directory, and lsof goes one step further in telling me the exact directory the process (listed as bash in the COMMAND column) is using as its current working directory.
Armed with this information, I start searching around and find that I have a virtual terminal open in which I used the cd command to descend into the /media/cdrom0/boot directory. I have to change to a directory outside of the mounted file system or exit that virtual terminal for the umount command to succeed. This example uses a simple oversight on my part to illustrate the point, but many times the process holding the file open is going to be outside of your direct control. In that case you have to decide whether or not to contact the user who owns the process and/or kill the process to release the file. Be careful when killing processes without contacting your users though, as it can cause the user who is accessing the file/directory some major problems.
Another scenario is something that has happened to me when running Arch Linux. At seemingly random intervals, MPlayer (run from the command line) would refuse to output sound and started complaining that the resource /dev/dsp was busy and that it couldn’t open /dev/snd/pcmC0D0p. Listing 5 shows an excerpt from the error MPlayer was giving me, and Listing 6 is the output that I got from running the lsof command on /dev/snd/pcmC0D0p.
Listing 5
[AO OSS] audio_setup: Can't open audio device /dev/dsp: Device or resource busy
[AO_ALSA] alsa-lib: pcm_hw.c:1325:(snd_pcm_hw_open) open /dev/snd/pcmC0D0p failed: Device or resource busy
Listing 6
$ lsof /dev/snd/pcmC0D0p
COMMAND  PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
firefox 4398 jwright  mem    CHR  116,6          5818 /dev/snd/pcmC0D0p
firefox 4398 jwright  84u    CHR  116,6      0t0 5818 /dev/snd/pcmC0D0p
exe     4534 jwright  mem    CHR  116,6          5818 /dev/snd/pcmC0D0p
exe     4534 jwright  37u    CHR  116,6      0t0 5818 /dev/snd/pcmC0D0p
After doing some research, I found that the exe process was associated with the version of the Google Chrome browser that I was running and with its use of Flash player. I closed Firefox and Chrome and then tested MPlayer again, but still didn’t have any sound. I then ran the same lsof command again and noticed that the exe process was still there, apparently hung. I killed the exe process and was then able to get sound out of MPlayer immediately.
Through this investigation I found that the problem was not truly random, but occurred whenever Chrome came in contact with a Flash movie with sound. The silent MPlayer problem only seemed random because I was not accessing Flash movies with sound at consistent intervals. Now I’m not meaning to pick on Arch Linux here, because the problem seems to have been present in other distributions as well. Also, I have been unable to reproduce this problem on newer versions of Google Chrome running on Arch Linux, telling me that the issue has probably been resolved.
Listing 7 shows a basic example of how you might use the lsof command to track what services/processes are using the libwrap (TCP Wrappers) library. Keep in mind that the | head -4 text at the end of the command line just selects the first 4 lines of output.
Listing 7
$ lsof /lib/libwrap.so.0 | head -4
COMMAND    PID    USER  FD   TYPE DEVICE SIZE/OFF NODE NAME
pulseaudi 1690 jwright mem    REG    8,1    30960  668 /lib/libwrap.so.0.7.6
gconf-hel 1693 jwright mem    REG    8,1    30960  668 /lib/libwrap.so.0.7.6
gnome-set 1703 jwright mem    REG    8,1    30960  668 /lib/libwrap.so.0.7.6
If you wanted to get a full system-wide view of the processes using libwrap, you would run the command with sudo or by issuing the su command (I recommend using sudo instead, though).
Carrying this example further, we could add the -i option to display the network connection information as well (Listing 8). The TCP argument to the option tells lsof that we want to only look at TCP connections, excluding other connections like UDP. Because lsof ORs its selection options together by default, the output contains both the processes that have libwrap open and the open TCP connections. This is a good way to study the services that are currently being protected by the TCP Wrappers mechanism. Please note that this command may take some time to complete.
Listing 8
$ lsof -i TCP /lib/libwrap.so.0 | head -10
COMMAND    PID    USER  FD   TYPE DEVICE SIZE/OFF NODE NAME
pulseaudi 1675 jwright mem    REG    8,1    30960  668 /lib/libwrap.so.0.7.6
gconf-hel 1678 jwright mem    REG    8,1    30960  668 /lib/libwrap.so.0.7.6
gnome-set 1690 jwright mem    REG    8,1    30960  668 /lib/libwrap.so.0.7.6
metacity  1719 jwright mem    REG    8,1    30960  668 /lib/libwrap.so.0.7.6
gnome-vol 1770 jwright mem    REG    8,1    30960  668 /lib/libwrap.so.0.7.6
firefox   1909 jwright mem    REG    8,1    30960  668 /lib/libwrap.so.0.7.6
chrome    1992 jwright 59u   IPv4 101427      0t0  TCP topbuntu.local:42427->iy-in-f83.1e100.net:https (ESTABLISHED)
chrome    1992 jwright 61u   IPv4 124360      0t0  TCP topbuntu.local:40761->208.69.36.231:https (CLOSE_WAIT)
chrome    1992 jwright 68u   IPv4  12636      0t0  TCP topbuntu.local:35689->iy-in-f18.1e100.net:https (ESTABLISHED)
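Output like Listing 8 gets long quickly, so a summary can help. The sketch below counts TCP connections per state from lsof's default output. The function name count_tcp_states is made up for this example, and it assumes the usual nine-column layout in which the NODE column (column 8) reads TCP for TCP sockets.

```shell
#!/bin/bash -
# Sketch: count TCP connections per state from default lsof output.
# count_tcp_states is a hypothetical helper name. It assumes the usual
# nine-column layout in which NODE (column 8) reads "TCP".
count_tcp_states() {
    awk '
        $8 == "TCP" && $NF ~ /^\(.+\)$/ {
            state = $NF
            gsub(/[()]/, "", state)   # "(ESTABLISHED)" becomes "ESTABLISHED"
            count[state]++
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Intended use (assumes lsof is installed):
#   sudo lsof -w -i TCP | count_tcp_states
```

Run against the three chrome lines in Listing 8, this should report one CLOSE_WAIT connection and two ESTABLISHED connections. Lines that are not TCP sockets, such as the libwrap entries, are skipped.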
By using the -t option, you receive output from lsof that can then be passed to another command like kill. Listing 9 shows that I have opened a file with two instances of tail -f so that tail will keep the file open and update me on any data that is appended to it. Listing 10 shows a quick way to terminate both of the tail processes in one shot using the -t option and back-ticks.
Listing 9
$ lsof /tmp/testfile.txt
COMMAND   PID    USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
tail    10784 jwright   3r    REG    8,1        0 16282 /tmp/testfile.txt
tail    10792 jwright   3r    REG    8,1        0 16282 /tmp/testfile.txt
Listing 10
$ kill `lsof -t /tmp/testfile.txt`
If you haven’t seen back-ticks (`) used before in the shell, it probably looks a little strange to you. The back-ticks in this instance tell the shell to execute the command between them, and then replace the back-ticked command with the output. So, for Listing 10 the section of the line within the back-ticks would be replaced by the list of PIDs that are accessing /tmp/testfile.txt. These PIDs are passed to the kill command which sends SIGTERM to each instance of tail, causing them to exit.
An alternative to this would be what you see in Listing 11, where the -k and -i options of the fuser command are used to interactively kill both instances of tail.
Listing 11
$ fuser -ki /tmp/testfile.txt
/tmp/testfile.txt:   11106 11107
Kill process 11106 ? (y/N) y
Kill process 11107 ? (y/N) y
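If you would rather give processes a chance to exit cleanly before forcing them, the two approaches above can be combined. The sketch below sends SIGTERM first, waits a few seconds, and only then sends SIGKILL to whatever is still running. The function name terminate_gracefully and its argument order are my own invention for this example, not something kill, lsof or fuser provides.

```shell
#!/bin/bash -
# Sketch: ask processes to exit with SIGTERM, then fall back to SIGKILL.
# terminate_gracefully is a hypothetical helper.
# usage: terminate_gracefully GRACE_SECONDS PID...
terminate_gracefully() (
    grace=$1
    shift
    [ "$#" -gt 0 ] || return 0               # nothing to do without PIDs

    kill -TERM "$@" 2> /dev/null || true     # polite request first

    waited=0
    while [ "$waited" -lt "$grace" ]; do
        alive=0
        for pid in "$@"; do
            # kill -0 sends no signal; it only tests whether the PID can be signaled
            if kill -0 "$pid" 2> /dev/null; then
                alive=1
            fi
        done
        [ "$alive" -eq 0 ] && return 0       # everyone has exited
        sleep 1
        waited=$((waited + 1))
    done

    for pid in "$@"; do                      # force whatever is left
        if kill -0 "$pid" 2> /dev/null; then
            kill -KILL "$pid" 2> /dev/null || true
        fi
    done
)

# Intended use with the file from Listing 10 (assumes lsof is installed):
#   terminate_gracefully 5 $(lsof -t /tmp/testfile.txt)
```

Because the helper returns immediately when it is given no PIDs, it also avoids the usage error that a bare kill prints when lsof -t finds nothing. The same caution applies as before: look at what lsof reports before you hand its PIDs to anything that kills.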

Tips and Tricks

  • Don’t use the -k option with fuser without checking to see which processes it will kill first. The easiest way to do this is by using the -ki option combination so that fuser will prompt you before killing the processes (see Listing 11). You can specify a signal other than SIGKILL to send to a process with the -SIGNAL argument to the -k option.
  • As mentioned above, the -r option of lsof causes it to repeat its scan every so many seconds, or indefinitely. This can be very useful when you are writing a script that may need to call lsof repeatedly because it avoids the wasted overhead of starting the command from scratch each time.
  • lsof functionality is supposed to be fairly standard across the Linux and Unix landscape, so using lsof in your scripts can be an advantage when you’re shooting for portability.
  • When you are using fuser to check who or what is using a mounted file system, add the -m option to your command line. By doing this, you tell fuser to list the users/processes that have files open in the entire file system, not just the directory you specify. This will prevent you from being confused when fuser doesn’t give you any information even though you know the mounted file system is in use. So, you would issue a command that’s something like
    sudo fuser -mu /media/cdrom
    to save you that trouble. You still don’t know which subdirectory or file is being held open, but this is easily solved by using the +D option with lsof to search the mounted file system recursively.
    sudo lsof +D /media/cdrom/

Scripting

These scripts are somewhat simplified and in most cases could be done other ways too, but they will work to illustrate the concepts. If you use these scripts, make sure you adapt them to your situation. Never run a script or command without understanding what it will do to your system.
For the first scripting example, let’s say that it’s 5:00 and you need to leave for the day, but you also have to delete a shared configuration file that’s still being used by several people. Presumably the configuration file will be automatically recreated when someone needs it next. The script shown in Listing 12 shows one way of taking care of the file deletion while still leaving on time, and it uses lsof. This assumes for the sake of the example that every system that has access to the shared configuration file releases it when users are done and log out for the night. Make sure to run this script with root privileges or it might not see everyone that’s using the file before deleting it, causing a mess.
Listing 12
#!/bin/bash -

# Check every 30 seconds to see if everyone is done with the file
lsof +r 30 /tmp/testfile.txt > /dev/null 2>&1

# We've made it past the lsof line, so we must be ok to delete the file
rm /tmp/testfile.txt
You end up with a very quick and simple script that doesn’t require a continuous while loop, or a cron job to finish its task.
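One thing Listing 12 cannot do is give up: if somebody never closes the file, lsof +r waits forever. As a hedged alternative, the sketch below polls with a timeout instead. It does bring back the loop that Listing 12 avoids, so it is a trade-off rather than an improvement. The names wait_until and file_is_free are invented for this example.

```shell
#!/bin/bash -
# Sketch: a bounded alternative to "lsof +r" from Listing 12.
# wait_until and file_is_free are hypothetical helper names.

# usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARG...]
# Re-runs COMMAND until it succeeds; gives up once about TIMEOUT_SECONDS
# of sleeping has passed (time spent inside COMMAND is not counted).
wait_until() (
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    until "$@"; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
)

# Succeeds when lsof lists no process for the file. lsof also exits
# non-zero on errors, so run it with root privileges as the post advises.
file_is_free() {
    command -v lsof > /dev/null 2>&1 || return 2   # cannot tell without lsof
    ! lsof -w -t "$1" > /dev/null 2>&1
}

# Intended use: wait up to an hour, checking every 30 seconds.
#   wait_until 3600 30 file_is_free /tmp/testfile.txt && rm /tmp/testfile.txt
```

Keeping the polling loop separate from the check means the same wait_until function can be reused with fuser -s or any other command that reports its result through its exit status.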
Another example would be using fuser to make a decision in a script. The script could check to see if a preferred resource is in use and move on to the next one if it is. Listing 13 shows an example of such a script.
Listing 13
#!/bin/bash -

# Make sure to run this script with root privileges or it
# may not work.

# Set up a counter to track which console we are checking
COUNTER=0

# Loop until we find an unused virtual console or run out of consoles
while true
do
    # Check to see if any user/process is using the virtual console
    fuser -s /dev/tty$COUNTER

    # Check to see if we've found an unused virtual console
    if [ $? -ne 0 ]
    then
        echo "The first unused virtual console is" /dev/tty$COUNTER
        break
    fi

    # Get ready to check the next virtual console
    COUNTER=$((COUNTER+1))

    # Try to get a listing of the virtual console we are checking
    ls /dev/tty$COUNTER > /dev/null 2>&1

    # Check to see if we've run out of virtual consoles to check.
    # The ls command exits with a non-zero status if the file doesn't exist.
    if [ $? -ne 0 ]
    then
        echo "No unused virtual console was found."
        break
    fi
done
This script loops through all of the virtual console device files (/dev/tty*) and looks for one that fuser says is unused. Notice that I’m checking the exit code of both fuser and ls via the built-in variable $?, which holds the exit status of the last command that was run.
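Checking $? works, but the shell can also test a command's exit status directly in an if statement. The sketch below reworks the search from Listing 13 that way and takes the "is it in use?" check as a parameter. The names first_unused and in_use are made up for this example.

```shell
#!/bin/bash -
# Sketch: the search from Listing 13 with the exit status tested directly.
# first_unused and in_use are hypothetical helper names.

# usage: first_unused CHECK_COMMAND CANDIDATE...
# Prints the first candidate for which CHECK_COMMAND fails ("not in use").
first_unused() (
    check=$1
    shift
    for candidate in "$@"; do
        if ! "$check" "$candidate"; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1    # every candidate was in use
)

# fuser -s prints nothing and exits with status 0 when something uses the file.
in_use() {
    fuser -s "$1" 2> /dev/null
}

# Intended use (assumes fuser is installed; run with root privileges):
#   first_unused in_use /dev/tty[0-9]* || echo "No unused virtual console was found."
```

Note that the shell expands /dev/tty[0-9]* in lexical order, so tty10 is checked before tty2. If numeric order matters to you, the counter from Listing 13 is the safer choice.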
That’s just a small sampling of what you can do with lsof and fuser within scripts. There are any number of ways to improve and expand upon the scripts that I’ve given in Listing 12 and Listing 13. Having an in-depth knowledge of the commands will open up a lot of possibilities for your scripts and even for your general use of the shell.

Troubleshooting

Every time that I try to run the lsof command on my Ubuntu 9.10 machine with administrative privileges, I get the following warning:
lsof: WARNING: can't stat() fuse.gvfs-fuse-daemon file system /home/jwright/.gvfs
This warning occurs when lsof tries to access the Gnome Virtual File System (gvfs), which is (among other things) a foundational part of Gnome’s Nautilus file manager. lsof is warning you that it doesn’t have the ability to look inside of the virtual file system and so its output may not contain every relevant file. This warning should be harmless, and can be suppressed with the -w option.
Listing 14
$ sudo lsof | head -3
lsof: WARNING: can't stat() fuse.gvfs-fuse-daemon file system /home/jwright/.gvfs
      Output information may be incomplete.
COMMAND PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
init      1 root  cwd    DIR    8,1     4096    2 /
init      1 root  rtd    DIR    8,1     4096    2 /
becomes something like this…
Listing 15
$ sudo lsof -w | head -3
COMMAND PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
init      1 root  cwd    DIR    8,1     4096    2 /
init      1 root  rtd    DIR    8,1     4096    2 /
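Listing 14 shows five lines even though head -3 was used. That is because the warning is written to standard error and never passes through the pipe. The same fact lets you save warnings for later review instead of throwing them away with -w. The sketch below shows the idea with a stand-in function, fake_lsof, that I wrote to replay part of Listing 14 so the example runs on any machine; on a real system you would call sudo lsof in its place.

```shell
#!/bin/bash -
# Sketch: keep lsof warnings for later review instead of discarding them.
# fake_lsof is a hypothetical stand-in that replays part of Listing 14, so the
# example runs anywhere; replace it with "sudo lsof" on a real system.
fake_lsof() {
    echo "lsof: WARNING: can't stat() fuse.gvfs-fuse-daemon file system /home/jwright/.gvfs" >&2
    echo "      Output information may be incomplete." >&2
    echo "COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME"
    echo "init 1 root cwd DIR 8,1 4096 2 /"
}

warnings_file=$(mktemp)
listing=$(fake_lsof 2> "$warnings_file")       # stdout captured, stderr saved to the file
warning_count=$(grep -c 'WARNING' "$warnings_file")

printf '%s\n' "$listing"
printf 'warnings saved to %s: %s\n' "$warnings_file" "$warning_count"
```

Once you have read the saved warnings and are satisfied that they are harmless, switching to -w is the simpler option.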
If lsof stops for a long time, you might need to use some of the “Precautionary Options” listed in the Apple Quickstart Guide in the Resources section. The lsof man page also has a group of sections which start at BLOCKS AND TIMEOUTS that may help you.

Conclusion

There’s a whole host of possibilities for the lsof and fuser commands beyond what I’ve mentioned here, but hopefully I’ve given you a good start. As with so many other things, the time you put into mastering your Linux system will pay you back again and again. If you have any information to add to what I’ve said here, feel free to drop a line in the comments section or send us an email.

Resources

  1. Apple’s Quickstart Guide For lsof
  2. A Good Practical lsof Reference By Philippe Hanrigou
  3. Undeleting Files With lsof and cp
  4. Using fuser To Deal With Device Busy Errors
  5. A Good Reference On fuser (Geared Toward Solaris) By Sandra Henry-Stocker
  6. What To Do With An lsof Gnome Virtual File System (gvfs) Error
  7. LPIC-1 : Linux Professional Institute Certification Study Guide By Roderick W. Smith
from http://innovationsts.com/?p=658
