Linux / Ubuntu / hdparm: Identifying drive features and setting sleep patterns

Preparing the storage

Install hdparm and smartmontools

Install hdparm and the SMART monitoring tools.

# apt install hdparm smartmontools

Identify the right hard drive

Make sure you identify the correct drive, as some of the commands will destroy data. If you don’t understand the commands, then check them first. You have been warned.

Identify the block size

Knowing the block size of the device is important. It helps optimise writes and, in the case of SSDs or flash drives, avoids unnecessary write amplification and wear.

[List details of all drives]

# fdisk -l

[...]
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 512 bytes / 4096 bytes
[...]

[List details of a specific drive]

# fdisk -l /dev/sda
[...]
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
[...]
# smartctl --all /dev/sda
[...]
Sector Sizes:     512 bytes logical, 4096 bytes physical
[...]

Pay attention to the physical/optimal size. This is the one that matters.
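
If you want to script this check, the physical size can be pulled out of the smartctl output. This is only a sketch: the function name physical_sector_size is made up for this example, and it assumes the two line formats I know smartctl prints ("Sector Sizes:" with two values, or "Sector Size:" when logical and physical are equal).

```shell
# physical_sector_size: read `smartctl -i` (or --all) output on stdin
# and print the physical sector size in bytes.
physical_sector_size() {
    awk '
        # "Sector Sizes:     512 bytes logical, 4096 bytes physical"
        /^Sector Sizes:/ { print $(NF - 2); exit }
        # "Sector Size:      512 bytes logical/physical" (both sizes equal)
        /^Sector Size:/  { print $3; exit }
    '
}

# Usage on a real drive (sda is a placeholder, needs root):
#   smartctl -i /dev/sda | physical_sector_size
```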

SSDs will hide the true size of the pages and blocks. Even the same drive models might be built with different components, so getting it right is tricky.

Some suggest that 4kB is a generally good size for SSDs: https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ssd-server-storage-applications-paper.pdf

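Match the ZFS ashift to the drive’s physical sector size. ashift is the base-2 logarithm of the sector size, so 4096 bytes corresponds to ashift=12 and 512 bytes to ashift=9.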
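
As a rough illustration of the relation between sector size and ashift, the helper below computes the base-2 logarithm with integer division. The name ashift_for is hypothetical, and it assumes the size is a power of two, which is the case for the 512 and 4096 byte values shown above.

```shell
# ashift_for BYTES: print log2(BYTES), assuming BYTES is a power of two.
ashift_for() {
    size=$1
    bits=0
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))      # halve until we reach 1
        bits=$((bits + 1))      # count how many halvings it took
    done
    echo "$bits"
}

# Usage with the kernel's view of the drive (sda is a placeholder):
#   ashift_for "$(cat /sys/block/sda/queue/physical_block_size)"
```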

Retrieve drive IDs

When setting up ZFS pools or using disk tools, it is best to avoid device names such as /dev/sda, as their order can easily change. Using the drive ID or serial ensures the correct drive is selected no matter which port the drives are plugged into or in which order they are detected.

This matters with any disk utility if you have several drives, or if you regularly plug in external drives.

$ ls -l /dev/disk/by-id/
[...]

lrwxrwxrwx 1 root root  9 Mar  9 13:16 usb-TOSHIBA_External_USB_3.0_20150612015531-0:0 -> ../../sda

[...]
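
To go the other way and find which by-id names point at a given device, something like the following can help. It is a minimal sketch: by_id_for is a made-up name, and the optional second argument exists only so the function can be tried against a test directory instead of /dev/disk/by-id.

```shell
# by_id_for DEVICE [DIR]: print every link in DIR (default
# /dev/disk/by-id) that resolves to the same file as DEVICE.
by_id_for() {
    target=$(readlink -f "$1")
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        [ -e "$link" ] || continue          # skip dangling links / empty dir
        if [ "$(readlink -f "$link")" = "$target" ]; then
            echo "$link"
        fi
    done
}

# Usage (sda is a placeholder):
#   by_id_for /dev/sda
```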

You can also extract model and serial numbers with hdparm.

# hdparm -i /dev/sda

/dev/sda:

 Model=WDC WD10EZEX-08WN4A0, FwRev=01.01A01, SerialNo=WD-WCC6Y5FXAPHV
 [...]

Even better, depending on how the drive will be used and whether mirror drives might be added later, partition the drive slightly below its full capacity so that a different drive model with a marginally smaller size can still be added. That said, I believe ZFS already does something similar by rounding partitions down to whole mebibytes.

Test for damaged sectors

An additional and optional step is to test the hard drive for damaged sectors. This kind of test tends to be destructive so it is best if it is done before configuring the pools.

badblocks is a useful tool to achieve this.

It is usually installed by default as part of e2fsprogs; if not, you can install it manually.

# apt install e2fsprogs

A destructive test can be done with:

# badblocks -wsv -b 4096 /dev/sda
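
Since -w overwrites the whole device, I find it worth adding a small guard before running it. The sketch below only checks ordinary mounts listed in /proc/mounts; it will not notice swap, LVM, or ZFS pool members. The function name is_mounted is an assumption for this example.

```shell
# is_mounted DEVICE [MOUNTS_FILE]: succeed if DEVICE or one of its
# partitions appears as a mount source. The prefix match is deliberately
# broad (/dev/sda also matches /dev/sdaa), which errs on the safe side.
is_mounted() {
    awk -v dev="$1" 'index($1, dev) == 1 { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Usage (sda is a placeholder):
#   if is_mounted /dev/sda; then
#       echo "refusing: /dev/sda is mounted" >&2
#   else
#       badblocks -wsv -b 4096 /dev/sda
#   fi
```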

If you want to run the test while preserving the disk data you can run it in a non-destructive way. This will take longer.

# badblocks -nsv -b 4096 /dev/sda

ZFS has built-in checks and protection so in most cases you can skip this step.

Setting hard drive sleep patterns

Above I explained that using disk IDs is always a better idea. For simplicity, I will be using device names in several examples below, but I still advise using IDs or serials.

Check if the disk supports sleep

Check if the drive supports standby. Note that -y does not just query the drive: it asks it to enter standby (spin down) immediately.

# hdparm -y /dev/sda

If supported the output will be:

/dev/sda:
 issuing standby command

Any other output might indicate that the drive doesn’t support standby, or that a different tool/setting might be required.

Next, check if the drive supports write cache:

# hdparm -I /dev/sda | grep -i 'Write cache'

The expected output is:

           *    Write cache

The * indicates that the feature is enabled. A feature listed without the * is supported by the drive but currently not enabled.
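
The grep above only shows the matching line. To tell the cases apart in a script, the feature list can be parsed, since hdparm -I prints it under an "Enabled Supported" heading with the * in the first column. This is a sketch with a made-up function name (feature_state), and it assumes the feature name is matched exactly as printed.

```shell
# feature_state NAME: read `hdparm -I` output on stdin and print
# "enabled" (listed with *), "supported" (listed without *), or "absent".
feature_state() {
    awk -v name="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)        # drop leading indentation
            sub(/[ \t]+$/, "", line)        # drop trailing whitespace
            enabled = (line ~ /^\*/)        # marker in the Enabled column
            sub(/^\*[ \t]+/, "", line)      # drop the marker if present
            if (line == name) {
                print (enabled ? "enabled" : "supported")
                found = 1
                exit
            }
        }
        END { if (!found) print "absent" }
    '
}

# Usage (sda is a placeholder, needs root):
#   hdparm -I /dev/sda | feature_state "Write cache"
```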

An example of a complete hdparm output from a drive is shown below for reference. Different drives, with different features, will show different output, or even none at all.

# hdparm -I /dev/sda

/dev/sda:

ATA device, with non-removable media
        Model Number:       TOSHIBA MD04ACA500                      
        Serial Number:      55OBK0SPFPHC
        Firmware Revision:  FP2A    
        Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
        Supported: 8 7 6 5 
        Likely used: 8
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:    16514064
        LBA    user addressable sectors:   268435455
        LBA48  user addressable sectors:  9767541168
        Logical  Sector size:                   512 bytes
        Physical Sector size:                  4096 bytes
        Logical Sector-0 offset:                  0 bytes
        device size with M = 1024*1024:     4769307 MBytes
        device size with M = 1000*1000:     5000981 MBytes (5000 GB)
        cache/buffer size  = unknown
        Form Factor: 3.5 inch
        Nominal Media Rotation Rate: 7200
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, no device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        Advanced power management level: 128
        DMA: sdma0 sdma1 sdma2 mdma0 mdma1 *mdma2 udma0 udma1 udma2 udma3 udma4 udma5 
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4 
             Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
                Security Mode feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    Host Protected Area feature set
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    NOP cmd
           *    DOWNLOAD_MICROCODE
           *    Advanced Power Management feature set
                SET_MAX security extension
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
           *    WRITE_{DMA|MULTIPLE}_FUA_EXT
           *    64-bit World wide name
           *    WRITE_UNCORRECTABLE_EXT command
           *    {READ,WRITE}_DMA_EXT_GPL commands
           *    Segmented DOWNLOAD_MICROCODE
                unknown 119[7]
           *    Gen1 signaling speed (1.5Gb/s)
           *    Gen2 signaling speed (3.0Gb/s)
           *    Gen3 signaling speed (6.0Gb/s)
           *    Native Command Queueing (NCQ)
           *    Host-initiated interface power management
           *    Phy event counters
           *    Host automatic Partial to Slumber transitions
           *    Device automatic Partial to Slumber transitions
           *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
                DMA Setup Auto-Activate optimization
                Device-initiated interface power management
           *    Software settings preservation
           *    SMART Command Transport (SCT) feature set
           *    SCT Write Same (AC2)
           *    SCT Error Recovery Control (AC3)
           *    SCT Features Control (AC4)
           *    SCT Data Tables (AC5)
           *    reserved 69[3]
Security: 
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
        not     frozen
        not     expired: security count
                supported: enhanced erase
        more than 508min for SECURITY ERASE UNIT. more than 508min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 500003964bc01970
        NAA             : 5
        IEEE OUI        : 000039
        Unique ID       : 64bc01970
Checksum: correct

An example of a complete smartctl output from a drive is shown below also for reference. As mentioned earlier, different systems will generate different outputs.

# smartctl --all /dev/sda
smartctl 7.1 2019-12-30 r5022 [aarch64-linux-5.4.0-1029-raspi] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 3.5" MD04ACA... Enterprise HDD
Device Model:     TOSHIBA MD04ACA500
Serial Number:    55OBK0SPFPHC
LU WWN Device Id: 5 000039 64bc01970
Firmware Version: FP2A
User Capacity:    5,000,981,078,016 bytes [5.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Mar  8 15:02:10 2021 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  120) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 533) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       9003
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       9222
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   084   084   000    Old_age   Always       -       6418
 10 Spin_Retry_Count        0x0033   253   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       9212
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       482
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       104
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       9225
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       37 (Min/Max 15/72)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       0
222 Loaded_Hours            0x0032   085   085   000    Old_age   Always       -       6393
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       214
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5617         -
# 2  Short offline       Completed without error       00%      4702         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

More information about hdparm and smartctl is available on the following sites.

hdparm

https://wiki.archlinux.org/index.php/Hdparm#Power_management_configuration

http://www.htpcguides.com/spin-down-and-manage-hard-drive-power-on-raspberry-pi/

http://www.linux-magazine.com/Online/Features/Tune-Your-Hard-Disk-with-hdparm

smartctl

https://codeyarns.com/2016/12/21/how-to-use-smartctl/

https://linuxhandbook.com/check-ssd-health/

Configure the drive standby

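Check the current Advanced Power Management (APM) level, which controls how aggressively the drive saves power and whether it may spin down.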

# hdparm -B /dev/sd[a-e]

/dev/sda:
 APM_level  = not supported

/dev/sdb:
 APM_level  = 254

/dev/sdc:
 APM_level  = not supported

/dev/sdd:
 APM_level  = 254

/dev/sde:
 APM_level  = 254

Value          Description
1 to 127       Power management is enabled. The lower the value, the more aggressive the power management will be.
128 to 254     Power management is enabled but doesn’t allow spindown.
255            The feature is disabled.
not supported  The drive doesn’t support APM.
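
The ranges above are easy to mix up, so here is a small helper that maps a numeric APM level to its meaning. It is only a sketch that mirrors the table; apm_meaning is a hypothetical name, not part of hdparm.

```shell
# apm_meaning LEVEL: describe a numeric `hdparm -B` level.
apm_meaning() {
    case "$1" in
        ''|*[!0-9]*) echo "not a number"; return 1 ;;
    esac
    if [ "$1" -ge 1 ] && [ "$1" -le 127 ]; then
        echo "enabled, spindown allowed"
    elif [ "$1" -ge 128 ] && [ "$1" -le 254 ]; then
        echo "enabled, no spindown"
    elif [ "$1" -eq 255 ]; then
        echo "disabled"
    else
        echo "out of range"; return 1
    fi
}
```

For example, apm_meaning 254 describes the level reported for /dev/sdb in the output above.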

The APM level can be set manually:

# hdparm -B 127 /dev/sda

The IDE power mode status can be queried with:

# hdparm -C /dev/sd[ab]

/dev/sda:
 drive state is:  active/idle

/dev/sdb:
 drive state is:  standby

For reference, several drives can be queried at the same time using different wildcards.

# hdparm -B /dev/sd?
# hdparm -C /dev/sd*
# hdparm -I /dev/sd[a-e]

Depending on the drive manufacturer and model you might need to query the settings with different flags. Check the man page.

[Get/set the Western Digital Green Drive's "idle3" timeout value.]
# hdparm -J /dev/sd[a-e]

/dev/sda:
 wdidle3      = 300 secs (or 13.8 secs for older drives)

/dev/sdb:
 wdidle3      = 8.0 secs

/dev/sdc:
 wdidle3      = 300 secs (or 13.8 secs for older drives)

/dev/sdd:
 wdidle3      = 300 secs (or 13.8 secs for older drives)

/dev/sde:
 wdidle3      = 300 secs (or 13.8 secs for older drives)

From the man page:

A setting of 30 seconds is recommended for Linux use. Permitted values are from 8 to 12 seconds, and from 30 to 300 seconds in 30-second increments. Specify a value of zero (0) to disable the WD idle3 timer completely (NOT RECOMMENDED!).
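
Before writing a value with -J, it can be checked against the permitted values quoted above. The helper below is a sketch based only on that man page excerpt; the name wdidle3_valid is an assumption for this example.

```shell
# wdidle3_valid SECONDS: succeed if SECONDS is 0 (timer disabled),
# 8 to 12, or 30 to 300 in 30-second steps.
wdidle3_valid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    v=$1
    if [ "$v" -eq 0 ]; then
        return 0
    elif [ "$v" -ge 8 ] && [ "$v" -le 12 ]; then
        return 0
    elif [ "$v" -ge 30 ] && [ "$v" -le 300 ] && [ $((v % 30)) -eq 0 ]; then
        return 0
    fi
    return 1
}
```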

There are flags for temperature (-H for Hitachi drives), acoustic management (-M), measuring cache performance (-T), and others. Go on, read that man page. 🙂

The -S flag sets the standby (spindown) timeout for the drive, that is, how long the drive will wait with no disk activity before turning off the motor.

Value       Description
0           Disable the feature.
1 to 240    Multiples of five seconds (a value of 120 means 10 minutes).
241 to 251  Multiples of thirty minutes (a value of 242 means 1 hour).
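
The encoding is a bit awkward to do in your head, so here is a sketch that converts a -S value into seconds. It covers only the ranges listed here, and spindown_seconds is a made-up name.

```shell
# spindown_seconds VALUE: convert an `hdparm -S` value to seconds.
spindown_seconds() {
    v=$1
    if [ "$v" -eq 0 ]; then
        echo "disabled"
    elif [ "$v" -ge 1 ] && [ "$v" -le 240 ]; then
        echo $(( v * 5 ))                    # units of 5 seconds
    elif [ "$v" -ge 241 ] && [ "$v" -le 251 ]; then
        echo $(( (v - 240) * 30 * 60 ))      # units of 30 minutes
    else
        echo "not covered here"; return 1
    fi
}
```

For instance, spindown_seconds 120 gives 600 (10 minutes) and spindown_seconds 242 gives 3600 (1 hour), matching the examples in the table.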

Note that hdparm might wake the drive up when it is queried. smartctl can query the drive without waking it.

# smartctl -i -d auto -n standby /dev/sda

Making the hdparm configuration persistent

Information about all the options is available at https://manpages.ubuntu.com/manpages/bionic/man5/hdparm.conf.5.html and also in the default configuration file generated by hdparm.

Example values from the data gathered above:

# APM setting (-B)
apm = 127

# APM setting while on battery (-B)
apm_battery = 127

# on/off drive's write caching feature (-W)
write_cache = on

# Standby (spindown) timeout for drive (-S)
spindown_time = 120

# Western Digital (WD) Green Drive's "idle3" timeout value. (-J)
wdidle3 = 300

hdparm.conf method

Edit the configuration file:

# vim /etc/hdparm.conf

Then insert an entry for each drive. Include only the settings/features/values that the drive supports, otherwise the rest of the options won’t be applied. Test, test, test!

# Drive A
/dev/disk/by-id/ata-WDC_WD40NMZM-59Y94S1_WD-WX41D296P1XX {
apm = 127
apm_battery = 127
write_cache = on
spindown_time = 120
#wdidle3 = 300
}
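
With several drives, typing these stanzas by hand gets tedious. The sketch below prints one stanza per call so it can be reviewed before being appended to the file. hdparm_stanza is a hypothetical helper, and it only emits the apm and spindown_time settings; anything else would need adding by hand.

```shell
# hdparm_stanza BY_ID_PATH APM SPINDOWN: print an hdparm.conf entry.
hdparm_stanza() {
    printf '%s {\n' "$1"
    printf '\tapm = %s\n' "$2"
    printf '\tspindown_time = %s\n' "$3"
    printf '}\n'
}

# Usage (the by-id name is a placeholder; review the output first):
#   hdparm_stanza /dev/disk/by-id/ata-EXAMPLE 127 120
```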

udev method

In my case, the hdparm.conf method above works. I couldn’t get the udev method to work on my system, possibly because of the OS, but I am leaving it here for reference in case it helps.

# vim /etc/udev/rules.d/69-disk.rules

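Create an entry for each drive, editing the serial number and hdparm parameters. Make sure that only supported flags are added or it will fail. Also check that the hdparm path in RUN matches the output of "command -v hdparm" on your system; on Debian and Ubuntu the binary normally lives in /sbin or /usr/sbin rather than /usr/bin.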

ACTION=="add", KERNEL=="sd[a-z]", ENV{ID_SERIAL_SHORT}=="S3R14LNUMB3R", RUN+="/usr/bin/hdparm -B 127 -S 120 /dev/%k"

You can also apply the same parameters to all rotational drives (all non-SSD ones) in one go.

ACTION=="add|change", KERNEL=="sd[a-z]", ATTRS{queue/rotational}=="1", RUN+="/usr/bin/hdparm -B 127 -S 120 /dev/%k"
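
Since a wrong path or a typo in the rule fails silently, it may help to generate the per-drive rule rather than type it. This is a sketch, not a tested fix for my setup: udev_rule is a made-up name, and the hdparm path defaults to whatever "command -v hdparm" reports unless a fourth argument is given.

```shell
# udev_rule SERIAL APM SPINDOWN [HDPARM_PATH]: print a udev rule line.
udev_rule() {
    serial=$1
    apm=$2
    spindown=$3
    # Fall back to the hdparm found in PATH when no path is supplied.
    hdparm_bin=${4:-$(command -v hdparm || echo hdparm)}
    printf 'ACTION=="add", KERNEL=="sd[a-z]", ENV{ID_SERIAL_SHORT}=="%s", RUN+="%s -B %s -S %s /dev/%%k"\n' \
        "$serial" "$hdparm_bin" "$apm" "$spindown"
}

# Usage (serial is a placeholder); after editing the rules file, reload:
#   udev_rule S3R14LNUMB3R 127 120 >> /etc/udev/rules.d/69-disk.rules
#   udevadm control --reload-rules
#   udevadm trigger
```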