\# Copyright (C) 2002 Bruce Allen <smartmontools-support@lists.sourceforge.net>
\#
\# $Id: smartctl.8,v 1.31 2003/01/01 08:21:00 ballen4705 Exp $
\#
\# This program is free software; you can redistribute it and/or modify it
\# under the terms of the GNU General Public License as published by the Free
\# Software Foundation; either version 2, or (at your option) any later
\# version.
\#
\# You should have received a copy of the GNU General Public License (for
\# example COPYING); if not, write to the Free Software Foundation, Inc., 675
\# Mass Ave, Cambridge, MA 02139, USA.
\#
\# This code was originally developed as a Senior Thesis by Michael Cornwell
\# at the Concurrent Systems Laboratory (now part of the Storage Systems
\# Research Center), Jack Baskin School of Engineering, University of
\# California, Santa Cruz. http://ssrc.soe.ucsc.edu/
\#
.TH SMARTCTL 8 "$Date: 2003/01/01 08:21:00 $" "smartmontools-5.0"
.SH NAME
smartctl \- S.M.A.R.T. control and monitor utility
.SH SYNOPSIS
.B smartctl [options] device
.SH DESCRIPTION
.B smartctl
controls the Self-Monitoring, Analysis and Reporting Technology
(S.M.A.R.T.) system built into many ATA-3 and later ATA, IDE and
SCSI-3 hard drives. The purpose of S.M.A.R.T. is to monitor the
reliability of the hard drive and predict drive failures, and to carry
out different types of drive self-tests. This version of smartctl is
compatible with ATA/ATAPI-5 and earlier standards (see REFERENCES
below).
.B smartctl
is a command line utility designed to perform S.M.A.R.T. tasks such as
printing the S.M.A.R.T. self-test and error logs, and enabling and
disabling S.M.A.R.T. automatic testing. Note: if the user issues a
S.M.A.R.T. command that is (apparently) not implemented by the device,
we print a warning message but issue the command anyway. This should
not cause problems: unimplemented S.M.A.R.T. commands issued to a
drive are ignored and return an error.
.B smartctl
also provides limited TapeAlerts support for some SCSI tape drives and
changers.
The user must specify the device to be controlled or interrogated as an
argument to
.B smartctl.
ATA devices use the form "/dev/hd*" and SCSI devices use the form "/dev/sd*".
For SCSI Tape Drives and Changers with TapeAlerts support use the devices
"/dev/st*" and "/dev/sg*". More general paths may also be specified.
.B smartctl
will attempt to guess the device type, but the '\-d' option can be used to
specify a device type of ATA or SCSI if required.
.PP
.SH OPTIONS
.PP
The options are grouped below into several categories.
.B smartctl
will execute these in the order: INFORMATION, ENABLE/DISABLE, DISPLAY
DATA, RUN/ABORT TESTS.
SCSI devices only accept the options
.B \-h, \-?, \-V, \-i, \-a, \-d, \-s, \-H, \-t, \-C
and
.B \-X.
TapeAlerts devices only accept the options
.B \-h, \-?, \-V, \-i, \-a, \-d, \-s
and
.B \-H.
Long options are not supported on all systems. Use
.B 'smartctl \-h'
to see the available options.
.TP
.B SHOW INFORMATION:
.TP
.B \-h, \-\-help, \-\-usage
Prints a usage message and exits.
.TP
.B \-?
Same as
.B \-h.
.TP
.B \-V, \-\-version, \-\-copyright, \-\-license
Prints version, copyright, license, home page and CVS-id information for your
copy of
.B smartctl.
Please include this information if you are reporting bugs or problems.
.TP
.B \-i, \-\-info
Prints the disk model number, serial number, firmware version, and ATA Standard
version/revision information. Says if the device supports S.M.A.R.T., and if
so, whether S.M.A.R.T. support is currently enabled or disabled.
.TP
.B \-a, \-\-all
Prints all S.M.A.R.T. information about the disk. This is equivalent to '\-H
\-i \-c \-A \-l error \-l selftest' (for SCSI, '\-H \-i').
.TP
.B RUN-TIME BEHAVIOR:
.TP
.B \-q TYPE, \-\-quietmode=TYPE
Specifies that
.B smartctl
should run in one of the two quiet modes described here. The valid arguments
to this option are:
.I errorsonly
\- only print: For the '\-l error' option, if nonzero, the number
of errors recorded in the SMART error log and the power-on time when
they occurred; For the '\-l selftest' option, errors recorded in the device
self-test log; For the '\-H' option, SMART "disk failing" status or device
attributes (pre-failure or usage) which failed either now or in the
past; For the '\-A' option, device attributes (pre-failure or usage)
which failed either now or in the past.
.I silent
\- print no output. The only way to learn about what was
found is to use the exit status of
.B smartctl
(see RETURN VALUES below).
.TP
.B \-d TYPE, \-\-device=TYPE
Specifies the type of the device. The valid arguments to this option are
.I ata
and
.I scsi.
If this option is not used then
.B smartctl
will attempt to guess the device type from the device name.
.TP
.B \-T TYPE, \-\-tolerance=TYPE
Specifies how tolerant
.B smartctl
should be of S.M.A.R.T. command failures. The valid arguments to this option
are:
.I normal
\- exit on failure of a mandatory S.M.A.R.T. command, but not on failure of an
optional S.M.A.R.T. command. This is the default.
.I conservative
\- exit on failure of any S.M.A.R.T. command.
.I permissive
\- ignore failure of any S.M.A.R.T. command.
Here "mandatory" means "required by the ATA/ATAPI-5 Specification if the
device implements the S.M.A.R.T. command set" and "optional" means "not
required by the ATA/ATAPI-5 Specification even if the device implements
the S.M.A.R.T. command set." The 'mandatory' S.M.A.R.T. commands are: (1)
Enable/Disable Attribute Autosave, (2) Enable/Disable S.M.A.R.T., and (3)
S.M.A.R.T. Return Status.
.TP
.B \-b TYPE, \-\-badsum=TYPE
Specifies the action
.B smartctl
should take if a checksum error is detected in the: (1) Device
Identity Structure, (2) S.M.A.R.T. Self-Test Log Structure, (3)
S.M.A.R.T. Attribute Value Structure, (4) S.M.A.R.T. Attribute
Threshold Structure, or (5) ATA Error Log Structure.
The valid arguments to this option are:
.I warn
\- report the incorrect checksum but carry on in spite of it. This is the
default.
.I exit
\- exit
.B smartctl.
.I ignore
\- continue silently without issuing a warning.
.TP
.B S.M.A.R.T. FEATURE ENABLE/DISABLE COMMANDS:
.IP
.B Note:
if multiple options are used to both enable and disable a
feature, then
.B both
the enable and disable commands will be issued. The enable command
will always be issued
.B before
the corresponding disable command.
.TP
.B \-s VALUE, \-\-smart=VALUE
Enables or disables S.M.A.R.T. on device. The valid arguments to
this option are
.I on
and
.I off.
Note that the command '\-s on' (perhaps
used with the '\-o on' and '\-S on' options) should be placed in a
start-up script for your machine, for example in rc.local or rc.sysinit.
In principle the S.M.A.R.T. feature settings are preserved over
power-cycling, but it doesn't hurt to be sure.
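.IP
A minimal sketch of such a start-up fragment, written as a shell function.
The function name enable_smart and the SMARTCTL override variable are
illustrative assumptions and are not part of smartmontools. Only
the '\-s on', '\-o on' and '\-S on' options are taken from this page.
.nf
```shell
# Hypothetical helper: enable S.M.A.R.T., automatic offline testing
# and attribute autosave on each device given as an argument.
# SMARTCTL may be set to override the program that is run (assumption).
enable_smart() {
    for dev in "$@"; do
        "${SMARTCTL:-smartctl}" -s on -o on -S on "$dev"
    done
}
```
.fi
A start-up script could then call, for example, enable_smart /dev/hda /dev/hdc.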
.TP
.B \-o VALUE, \-\-offlineauto=VALUE
Enables or disables S.M.A.R.T. automatic offline test, which scans the drive
every four hours for disk defects. This command can be given during normal
system operation. The valid arguments to this option are
.I on
and
.I off.
Note that the S.M.A.R.T. automatic offline test command is listed as 'Obsolete'
in every version of the ATA and ATA/ATAPI Specifications
that I can find. However it is implemented and used by some
vendors. [Good documentation can be found in IBM's Official
Published Disk Specifications. For example the IBM Travelstar 40GNX
Hard Disk Drive Specifications (Revision 1.1, 22 April 2002,
Publication # 1541, Document S07N-7715-02) page 164.]
S.M.A.R.T. provides
.B three basic categories of testing.
The
.B first category,
called 'online' testing, has no effect on the performance of
the device. It is turned on by the '\-s on' option.
The
.B second category of testing
is called 'offline' testing. This type
of test can, in principle, degrade the device performance. The '\-o on'
option causes this offline testing to be carried out, automatically,
on a regular scheduled basis. Normally, the disk will suspend any
offline testing while disk accesses are taking place, then
automatically resume them when the disk would otherwise be idle, so in
practice it has little effect. Note that a one-time offline test can
also be carried out immediately upon receipt of a user command. See
the '\-t offline' option below, which causes a one-time offline test to be
carried out immediately.
Any errors detected in automatic or immediate offline testing will be
shown in the S.M.A.R.T. error log, and will be reflected in the values
of the S.M.A.R.T. attributes. These are visible with the '\-l error' and '\-A' options.
The
.B third category of testing
is the 'self' testing. This third type of
test is only performed (immediately) when a command to run it is
issued. The '\-t' and '\-X' options can be used to carry out and abort such
self-tests; please see below for further details.
Any errors detected in the self testing will be shown in the
S.M.A.R.T. self-test log, which can be examined using the '\-l selftest'
option.
.B Note:
in this manual page, the word
.B "Test"
is used in connection with the second category
just described, e.g. for the 'offline' testing. The words
.B "Self-test"
are used in connection with the third category.
.TP
.B \-S VALUE, \-\-saveauto=VALUE
Enables or disables S.M.A.R.T. autosave of device vendor-specific
attributes. The valid arguments to this option are
.I on
and
.I off.
Note that this feature is preserved across disk power cycles, so you should only
need to issue it once.
.TP
.B S.M.A.R.T. READ AND DISPLAY DATA OPTIONS:
.TP
.B \-H, \-\-health
Check: Ask the device to report its S.M.A.R.T. health status. It does
this using information that it has gathered from online and offline
tests, which were used to determine/update its
S.M.A.R.T. vendor-specific attribute values.
If the device reports failing health status, this means
.B either
that the device has already failed,
.B or
that it is predicting its own failure within the next 24 hours. If
this happens, use the '\-a' option to get more information, and
.B get your data off the disk and someplace safe as soon as you can.
.TP
.B \-c, \-\-capabilities
Prints only the generic S.M.A.R.T. capabilities. These show
what S.M.A.R.T. features are implemented and how the device will
respond to some of the different S.M.A.R.T. commands. For example it
shows if the device logs errors, if it supports offline surface
scanning, and so on. If the device can carry out self-tests, this
option also shows the estimated time required to run those tests.
Note that the times required to run the Self-tests (listed in minutes)
are fixed. However the time required to run the Immediate Offline
Test (listed in seconds) is variable. This means that if you issue a
command to perform an Immediate Offline test with the '\-t offline' option,
then the time may jump to a larger value and then count down as the
Immediate Offline Test is carried out. Please see REFERENCES below
for further information about the flags and capabilities described
by this option.
.TP
.B \-A, \-\-attributes
Prints only the vendor specific S.M.A.R.T. attributes. The
attributes are numbered from 1 to 253 and have specific names. For
example attribute 12 is 'power cycle count': how many times the
disk has been powered up. Each attribute has a 'Raw' value, printed under
the heading 'Raw Value', and a 'Normalized' value printed under the
heading 'Value'. [Note:
.B smartctl
prints these values in base-10.]
Each vendor uses their own magic to convert the Raw
value to a Normalized value. If the Normalized value is
.B less than or equal to
the value given under the 'Threshold' column, then disk failure
is imminent. The column labeled 'Worst' shows the lowest (closest to
failure) value that the disk has recorded at any time during its
lifetime when S.M.A.R.T. was enabled.
Note that the conversion from 'Raw' value to physical units is not
specified by the S.M.A.R.T. standard. In most cases, the values printed by
.B smartctl
are sensible. However in some cases a vendor uses unusual
conventions. For example the Hitachi disk on my laptop reports its
power-on hours in minutes, not hours. Some IBM disks track three
temperatures rather than one, in their raw values. And so on.
The table printed out by this option also shows the 'Type' of the
attribute. Pre-failure attributes are ones which, if less than or
equal to their threshold values, indicate pending disk failure. Old
age, or usage attributes, are ones which indicate end-of-product life
from old-age or normal aging and wearout, if the attribute value is
less than or equal to the threshold.
If the attribute's current value is <= threshold, then the 'Ever
failed' column will display 'FAILED NOW!'. If not, but the worst
recorded value is <= threshold, then this column will display 'In the
past'.
Note that starting with ATA/ATAPI-4, revision 4, the meaning of these
attribute fields has been made entirely vendor-specific. However most
ATA/ATAPI-5 disks seem to respect their meaning, so we have retained
this option.
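.IP
The rule for the 'Ever failed' column described above can be sketched as a
small shell function. This is an illustration only: the name ever_failed is
hypothetical, and printing nothing when neither condition holds is an
assumption, not a statement about what
.B smartctl
itself prints in that case.
.nf
```shell
# Hypothetical helper mirroring the 'Ever failed' rule.
# $1 = current normalized value, $2 = worst value, $3 = threshold
ever_failed() {
    if [ "$1" -le "$3" ]; then
        echo "FAILED NOW!"
    elif [ "$2" -le "$3" ]; then
        echo "In the past"
    fi
    # Otherwise print nothing (assumption made for this sketch).
}
```
.fi
For example, ever_failed 50 15 20 reports a past failure, because the worst
value 15 is at or below the threshold 20 while the current value 50 is not.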
.TP
.B \-l TYPE, \-\-log=TYPE
Prints either the S.M.A.R.T. error log or the S.M.A.R.T. self-test log. The
valid arguments to this option are:
.I error
\- prints only the S.M.A.R.T. error log. S.M.A.R.T. disks maintain
a log of the most recent five non-trivial errors. For each of these
errors, the disk power-on lifetime at which the error occurred is
recorded, as is the device status (idle, standby, etc) at the time of
the error. Finally, up to the last five commands that preceded the
error are also recorded, along with a timestamp measured in seconds
from when the disk was powered up during the session where the error
took place. [Note: this time stamp wraps after 2^32 milliseconds, or
49 days 17 hours 2 minutes and 47.296 seconds.]
The key ATA disk registers are also recorded in the log.
.I selftest
\- prints only the S.M.A.R.T. self-test log. The disk maintains a
log showing the results of the self tests, which can be run using
the '\-t' option described below. The log will show, for each of
the most recent twenty-one self-tests, the type of
test (short or extended, off-line or captive) and the final status of
the test. If the test did not complete successfully, the percentage
of the test remaining is shown. The time at which the test took place,
measured in hours of disk lifetime, is shown. If any errors were
detected, the Logical Block Address (LBA) of the first error is printed
in hexadecimal notation.
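.IP
The wrap-around period quoted above for the error log timestamp (2^32
milliseconds) can be checked with shell arithmetic. The function name
format_ms is hypothetical, and the sketch assumes a shell whose arithmetic
handles values larger than 32 bits.
.nf
```shell
# Hypothetical helper: convert a millisecond count into
# days, hours, minutes and seconds.
format_ms() {
    secs=$(($1 / 1000))
    # Zero-pad the millisecond remainder to three digits.
    frac=$(printf "%03d" $(($1 % 1000)))
    echo "$((secs / 86400)) days $((secs % 86400 / 3600)) hours $((secs % 3600 / 60)) minutes $((secs % 60)).$frac seconds"
}

# 2^32 milliseconds is 4294967296 ms.
format_ms 4294967296
```
.fi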
.TP
.B \-v N,OPTION, \-\-vendorattribute=N,OPTION
Sets a vendor-specific display OPTION for attribute N. There is currently only
one valid argument to this option:
.I 9,minutes
\- the disk stores Raw Attribute number 9 (power on time) in
minutes rather than hours, so divide by 60 before displaying it.
.TP
.B S.M.A.R.T. RUN/ABORT OFFLINE TEST AND SELF-TEST OPTIONS:
.TP
.B \-t TEST, \-\-test=TEST
Executes TEST immediately. The '\-C' option can be used in conjunction
with this option to run the short or long self-tests in captive mode.
Note that only one test can be run at a time, so this option should only
be used once per command line.
The valid arguments to this option are:
.I offline
\- runs S.M.A.R.T. Immediate Offline Test. This immediately
starts the test described above. This command can be given during
normal system operation. The effects of this test are visible only in
that it updates the S.M.A.R.T. attribute values, and if errors are
found they will appear in the S.M.A.R.T. error log, visible with the '\-l error'
option.
If the '\-c' option to
.B smartctl
shows that the device has the "Suspend Offline collection upon new
command" capability then you can track the progress of the Immediate Offline
test using the '\-c' option to
.B smartctl.
If the '\-c' option shows that the device has the "Abort Offline
collection upon new command" capability then most commands will abort
the Immediate Offline Test, so you should not try to track the
progress of the test with '\-c', as it will abort the test.
.I short
\- runs S.M.A.R.T. Short Self Test (usually under ten minutes).
This command can be given during normal system operation (unless run in
captive mode \- see the '\-C' option below). This is a
test in a different category than the immediate or automatic offline
tests. The 'Self' tests check the electrical and mechanical
performance as well as the read performance of the disk. Their
results are reported in the Self Test Error Log, readable with
the '\-l selftest' option. Note that on some disks the progress of the
test can be monitored by watching this log during the test; with other disks
use the '\-c' option to monitor progress.
.I long
\- runs S.M.A.R.T. Extended Self Test (tens of minutes). This is a
longer and more thorough version of the Short Self Test described
above. Note that this command can be given during normal
system operation (unless run in captive mode \- see the '\-C' option below).
.TP
.B \-C, \-\-captive
With '\-t short' or '\-t long', runs the self-test in captive mode. This has
no effect with '\-t offline' or if the '\-t' option is not used.
.B WARNING: Tests run in captive mode may busy out the drive for the length
.B of the test. Only run this on drives without any mounted partitions.
.TP
.B \-X, \-\-abort
Aborts non-captive S.M.A.R.T. Self Tests. Note that this
command will abort the Offline Immediate Test routine only if your
disk has the "Abort Offline collection upon new command" capability.
.PP
.SH EXAMPLES
.nf
.B smartctl \-a /dev/hda
.fi
Print all S.M.A.R.T. information for drive /dev/hda (Primary Master).
.PP
.nf
.B smartctl \-s off /dev/hdd
.fi
Disable S.M.A.R.T. on drive /dev/hdd (Secondary Slave).
.PP
.nf
.B smartctl \-\-smart=on \-\-offlineauto=on \-\-saveauto=on /dev/hda
.fi
Enable S.M.A.R.T. on drive /dev/hda, enable automatic offline
testing every four hours, and enable autosaving of
S.M.A.R.T. attributes. This is a good start-up line for your system's
init files. You can issue this command on a running system.
.PP
.nf
.B smartctl \-t long /dev/hdc
.fi
Begin an extended self-test of drive /dev/hdc. You can issue this
command on a running system. The results can be seen in the self-test
log visible with the '\-l selftest' option after it has completed.
.PP
.nf
.B smartctl \-s on \-t offline /dev/hda
.fi
Enable S.M.A.R.T. on the disk, and begin an immediate offline test of
drive /dev/hda. You can issue this command on a running system. The
results are only used to update the S.M.A.R.T. attributes, visible
with the '\-A' option. If any device errors occur, they are logged to
the S.M.A.R.T. error log, which can be seen with the '\-l error' option.
.PP
.nf
.B smartctl \-A \-v 9,minutes /dev/hda
.fi
Shows the vendor attributes, when the disk stores its power-on time
internally in minutes rather than hours.
.PP
.nf
.B smartctl \-q errorsonly \-H \-l selftest /dev/hda
.fi
Produces output only if the device returns failing S.M.A.R.T. status,
or if some of the logged self-tests ended with errors.
.PP
.nf
.B smartctl \-q silent \-a /dev/hda
.fi
Examine all S.M.A.R.T. data for device /dev/hda, but produce no
printed output. You must use the exit status (the
.B $?
shell variable) to learn if any attributes are out of bound, if the
S.M.A.R.T. status is failing, if there are errors recorded in the
self-test log, or if there are errors recorded in the disk error log.
.PP
.SH RETURN VALUES
The return values of smartctl are defined by a bitmask. For the
moment this only works on ATA disks. The different bits in the return
value are as follows:
.TP
.B Bit 0:
Command line did not parse.
.TP
.B Bit 1:
Device open failed, or device did not return an IDENTIFY DEVICE structure.
.TP
.B Bit 2:
Some SMART command to the disk failed, or there was a checksum error
in a SMART data structure (see '\-b' option above).
.TP
.B Bit 3:
SMART status check returned "DISK FAILING".
.TP
.B Bit 4:
SMART status check returned "DISK OK" but we found prefail attributes <= threshold.
.TP
.B Bit 5:
SMART status check returned "DISK OK" but we found that some (usage
or prefail) attributes have been <= threshold at some time in the
past.
.TP
.B Bit 6:
The device error log contains records of errors.
.TP
.B Bit 7:
The device self-test log contains records of errors.
To test within the shell for whether or not the different bits are
turned on or off, you can use the following type of construction (this
is bash syntax):
.nf
.B smartstat=$(($? & 8))
.fi
This looks only at bit 3 of the exit status
.B $?
(since 8=2^3). The shell variable
$smartstat will be nonzero if SMART status check returned 'disk
failing' and zero otherwise.
.PP
.SH AUTHOR
Bruce Allen
.B smartmontools-support@lists.sourceforge.net
.fi
University of Wisconsin \- Milwaukee Physics Department
.PP
.SH CREDITS
.fi
This code was derived from the smartsuite package, written by Michael
Cornwell, and from the previous ucsc smartsuite package. It extends
these to cover ATA-5 disks. This code was originally developed as a
Senior Thesis by Michael Cornwell at the Concurrent Systems Laboratory
(now part of the Storage Systems Research Center), Jack Baskin School
of Engineering, University of California, Santa
Cruz. http://ssrc.soe.ucsc.edu/.
.SH
HOME PAGE FOR SMARTMONTOOLS:
.fi
Please see the following web site for updates, further documentation, bug
reports and patches:
.nf
.B
http://smartmontools.sourceforge.net/
.SH
SEE ALSO:
.B
smartd (8)
.SH
REFERENCES FOR S.M.A.R.T.
.fi
If you would like to understand better how S.M.A.R.T. works, and what
it does, a good place to start is Section 8.41 of the 'AT
Attachment with Packet Interface-5' (ATA/ATAPI-5) specification. This
documents the S.M.A.R.T. functionality which the smartmontools
utilities provide access to. You can find Revision 1 of this document
at:
.nf
.B
http://www.t13.org/project/d1321r1c.pdf
.fi
Future versions of the specifications (ATA/ATAPI-6 and ATA/ATAPI-7),
and later revisions (2, 3) of the ATA/ATAPI-5 specification are
available from:
.nf
.B
http://www.t13.org/#FTP_site
.fi
The functioning of S.M.A.R.T. is also described by the SFF-8035i
revision 2 specification. This is a publication of the Small Form
Factors (SFF) Committee, and can be obtained from:
.TP
\
SFF Committee
.nf
14426 Black Walnut Ct.
.nf
Saratoga, CA 95070, USA
.nf
SFF FaxAccess: +01 408-741-1600
.nf
Ph: +01 408-867-6630
.nf
Fax: +01 408-867-2115
.nf
E-Mail: 250-1752@mcimail.com.
.PP
Please let us know if there is an on\-line source for this document.
.SH
CVS ID OF THIS PAGE:
$Id: smartctl.8,v 1.31 2003/01/01 08:21:00 ballen4705 Exp $