dunix student

Upload: ivobkaiser

Post on 29-Apr-2017

Page 1: Dunix Student

EY–X794E–SG.0003

AdvFS Internals and Troubleshooting

Course Guide

Notice

The information in this publication is subject to change without notice.

COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL.

This guide contains information protected by copyright. No part of this guide may be photocopied or reproduced in any form without prior written consent from Compaq Computer Corporation.

The software described in this guide is furnished under a license agreement or nondisclosure agreement. The software may be used or copied only in accordance with the terms of this agreement.

Other product names mentioned herein may be trademarks and/or registered trademarks of their respective companies.

©2000 Compaq Computer Corporation. All rights reserved. Printed in the USA.

Aero, ALPHA, ALPHA AXP, AlphaServer, AlphaStation, Armada, BackPaq, COMPAQ, Compaq Insight Manager, CompaqCare logo, Counselor, DECterm, Deskpro, DIGITAL, DIGITAL logo, DIGITAL Alpha Systems, Digital Equipment Corporation, DIGITAL UNIX, DirectPlus, FASTART, Himalaya, InfoPaq, Integrity, LicensePaq, Ministation, NetFlex, NonStop, OpenVMS, PaqFax, Presario, ProLiant, ProLinea, ProSignia, QuickBack, QuickFind, Qvision, RDF, RemotePaq, RomPaq, ServerNet, SERVICenter, SmartQ, SmartStart, SmartStation, SolutionPaq, SpeedPaq, StorageWorks, Systempro/LT, Tandem, TechPaq, TruCluster, Tru64 UNIX, registered in United States Patent and Trademark Office.

Atalla, C-Series, Expand, FOX, Guardian, iTP, Measure, Netelligent, and PointView are trademarks of Compaq Computer Corporation.

Microsoft, Windows, and Windows NT are registered trademarks of Microsoft Corporation. MIPS is a trademark of MIPS Computer Systems. Motif, OSF and OSF/1 are registered trademarks of the Open Software Foundation. NFS is a registered trademark of Sun Microsystems, Inc. Oracle is a registered trademark and Oracle7 is a trademark of Oracle Corporation. POSIX is a registered trademark of the Institute of Electrical and Electronics Engineers. PostScript is a registered trademark of Adobe Systems, Inc. UNIX is a registered trademark licensed exclusively through X/Open Company Ltd. X Window System is a trademark of the Massachusetts Institute of Technology. Intel, Pentium, and Intel Inside are registered trademarks and Xeon is a trademark of Intel Corporation.

UNIX is a trademark in the US and other countries, licensed exclusively through X/Open Company Ltd.

AdvFS Internals and Troubleshooting

Course Guide

January 2000

Contents

About This Course

About This Course
    Introduction
    Course Description
    Place in Curriculum
    Target Audience
    Prerequisites
    Course Goals
    Nongoals

Taking This Course
    Course Organization
    Course Map
    Chapter Descriptions
    Time Schedule
    Course Conventions
    Resources

1 Advanced File System Concepts

About This Chapter
    Introduction
    Objectives
    Resources

Introducing AdvFS
    Overview
    File Domains and Filesets
    AdvFS Characteristics
    AdvFS Capabilities
    Filesets and Partitions
    Volumes
    Filesets

Using Extent-Based Storage
    Overview
    Extent Concepts
    Displaying Extents Using the showfile Command

Logging
    Overview
    Why Logging
    Logging a Transaction
    AdvFS Logging

Cloning
    Overview
    Cloning a Fileset Using clonefset
    Fileset Clones
    Cloning Issues

Striping
    Overview
    File Striping

Using Trashcans
    Overview
    Overview of Trashcans

Reviewing AdvFS Commands
    Overview
    File Domain Commands
    Fileset Commands
    File Commands

Examining AdvFS Architecture and Components
    Overview
    AdvFS Architecture
    AdvFS Components
    AdvFS in Tru64 UNIX V5

Summary
    Introducing AdvFS
    Using Extent-Based Storage
    Logging
    Cloning
    Striping
    Using Trashcans
    Reviewing AdvFS Commands
    Examining AdvFS Architecture and Components

Exercises
    Introducing AdvFS
    Using Extent-Based Storage
    Cloning
    Using Trashcans
    Striping

Solutions
    Introducing AdvFS
    Using Extent-Based Storage
    Cloning
    Using Trashcans
    Striping

2 AdvFS On-Disk Structures

About This Chapter
    Introduction
    Objectives
    Resources

Introducing AdvFS On-Disk Structures
    Overview
    Two-Level Implementation of AdvFS
    .tags Directory
    BAS On-Disk Format: Everything is a Bitfile
    Bitfiles
    Mcells
    AdvFS File Addresses
    Bitfile-Set
    Reusing Tags

Describing BAS On-Disk Metadata Bitfiles
    Overview
    Domain and Volume Structures
    Per Domain Bitfiles
    Per Volume Bitfiles
    Per Fileset Bitfiles
    Reserved Bitfile Special Names
    Metadata Bitfile Tags
    .tags for Directory Entries for Metadata Bitfiles
    Bitfile Metadata Table
    Mcell Records
    Mcell Page Structure
    RBMT Page 0
    BMT Page 0
    BMT Page Format
    Mcell Addresses
    Reserved Mcell Addresses
    Mcell Format
    Mcell Records
    Utilities for Viewing Records

Using Extent Maps
    Overview
    Extent Maps for Nonreserved Files
    Extent Maps for Reserved Files
    Encoding of Extents

Using Tags
    Overview
    Tag File Characteristics
    Tag File Page
    Tagmap Entries
    Root Tag File
    Fileset Tag File
    Cloning through Fileset Tag File
    Utility for Viewing Tag Files
    UNIX Directories
    POSIX Files
    AdvFS Tagfiles and Migration

Assigning Fragments
    Overview
    Fragment Bitfile
    Fragment Groups
    Fragment Header
    Fragment Utilities
    Fragments and Files

Defining the Storage Bitmap Bitfile and Miscellaneous Bitfile
    Overview
    Storage Bitmap Bitfile Characteristics
    SBM Format
    Miscellaneous Bitfile

Summary
    Introducing AdvFS On-Disk Structures
    Describing BAS On-Disk Metadata Bitfiles
    Using Extent Maps
    Using Tags
    Assigning Fragments
    Defining the Storage Bitmap Bitfile and Miscellaneous Bitfile

Exercises

Solutions

3 AdvFS In-Memory Structures

About This Chapter
    Introduction
    Objectives
    Resources

Examining AdvFS In-Memory Structures
    Overview
    Overview of In-Memory Structures
    Big Picture of Data Structure Linkage

Checking the VFS Layer
    Overview
    VFS Specific Structures
    vnode Structure
    mount Structure

Explaining the FAS Layer
    Overview
    FAS Layer Structures
    In-Memory Per File Structures
    bfNode Structure
    fsContext Structure
    In-Memory Per Fileset Structures
    fileSetNode Structure
    Fileset Quota Structures
    User and Group Quota Structures

Locating the BAS Layer
    Overview
    BAS Layer Structure Overview
    Access to BAS Structures
    bfAccess Structure
    Managing bfAccess Structures
    bfSet Structure
    Finding bfSet Structures
    domain Structure
    Finding domain Structures
    vd Structure

Defining Other In-Memory Structures
    Free Space Cache
    Bitfile Buffer Descriptor
    I/O Descriptor
    FTX State Structure

Summary
    Examining AdvFS In-Memory Structures
    Checking the VFS Layer
    Explaining the FAS Layer
    Locating the BAS Layer
    Defining Other In-Memory Structures

Exercises

Solutions

4 AdvFS System Calls and Kernel Interfaces

About This Chapter
    Introduction
    Objectives
    Resources

Describing Entries to AdvFS
    Overview
    VFS Switch Table
    vnode Switch Table
    UBC Interface
    Device Driver Interface Routines
    AdvFS Lightweight Context Interface
    AdvFS I/O Completion Function
    True AdvFS System Call
    Types of AdvFS System Calls
    Domains and Volumes
    Filesets
    Miscellaneous Operations

Starting Up and Recovering in AdvFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-13Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-13Startup and Recovery Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-13Mounting the File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-13Activating the Bitfile-Set. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-14Activating the Domain and Searching for Virtual Disks. . . . . . . . . . . . . . . . . . . . . . . . . . . .4-14Activating the Domain: Full Activation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-14Recovering a Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-14Recovery Pass: Recovers Domain Consistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-15

Providing Storage Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16BAS-Level Storage Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16FAS-Level Storage Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16Truncating Bitfiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-17

Cloning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-18Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-18Creating a Clone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-18Writing to a Cloned Original . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-19Reading from a Clone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-19Deleting Bitfile from Cloned Original. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-20Deleting a Bitfile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-20Closing a Deleted Bitfile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-20

Migrating Files and Deleting Filesets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-22Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-22Migrating a Bitfile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-22Deleting a Fileset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-22

Documenting Threads. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-23Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-23AdvFS Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-23Fragment Bitfile Thread . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-23I/O Thread . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-24AdvFS Cleanup Thread . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-24

Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-25

vii

Page 10: Dunix Student

Describing Entries to AdvFS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-25Starting Up and Recovering in AdvFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-25Providing Storage Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-25Cloning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-26Migrating Files and Deleting Filesets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-26Documenting Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-26

Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-27Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-28

5 Troubleshooting AdvFS

About This Chapter
    Introduction
    Objectives
    Resources
    Case Study Format

Describing AdvFS Troubleshooting Practices
    AdvFS Commands and Utilities
    Troubleshooting Tips and Practices

Troubleshooting File System Corruption
    Overview
    Recognizing File System Corruption
    Causes of AdvFS Corruption
    No Valid File System Error Message
    Mount File System Operation Crashes the System
    Localized Corruption
    Generalized Corruption
    Domain Panic

Resolving Known AdvFS Issues
    Overview
    Log Half-Full Problem
    Fixing Log Half-Full Problems: Reducing Fragmentation
    Determining Appropriate Log Size
    Fixing Log Half Full Problems: Increasing Log Size Using switchlog
    Fixing Log Half Full Problems: Increasing Log Size Using mkfdmn
    BMT Exhaustion
    Avoiding BMT Exhaustion
    BMT Extent Map Allocations
    BMT Exhaustion: Fixing the Problem

Case Study 1: RBMT Corruption
    Overview
    Problem Statement: Case Study 1
    Configuration: Case Study 1
    Problem Description: Case Study 1
    Analysis

Case Study 2: Fragment-Free List Corruption
    Overview
    Problem Statement: Case Study 2
    Configuration: Case Study 2
    Problem Description: Case Study 2
    Analysis
    Things Attempted: Case Study 2
    Final Solution/Summary: Case Study 2

Case Study 3: Corruption and System Panic
    Overview
    Problem Statement: Case Study 3
    Configuration: Case Study 3
    Problem Description: Case Study 3
    Analysis
    Things Attempted: Case Study 3
    Final Solution/Summary: Case Study 3

Using the salvage Utility
    What is salvage?
    salvage Examples
    When to Use salvage
    Using salvage in Conjunction with Backup Media
    Using salvage in the Absence of Backup Media
    Using salvage in the Case of Very Large Domains
    Using salvage in the Case of Massive Metadata Corruption

Summary
    Describing AdvFS Troubleshooting Practices
    Troubleshooting File System Corruption
    Resolving Known AdvFS Issues
    Performing Case Studies

Exercises
    Describing AdvFS Troubleshooting Practices

Solutions
    Describing AdvFS Troubleshooting Practices

A AdvFS Commands and Utilities

About This Appendix
    Introduction
    Resources

AdvFS Commands and Utilities
    Overview
    Commands in Certain Versions of Tru64 UNIX
    addvol
    DIGITAL UNIX V4.x Specific Information for addvol (-x and -p will be retired)
    advfsstat
    advscan
    balance
    chfile
    chfsets
    chvol
    defragment
    logread
    migrate
    mkfdmn
    DIGITAL UNIX V4.x Specific mkfdmn Information
    mkfset
    mountlist
    ncheck
    nvbmtpg
    nvfragpg
    nvlogpg
    nvtagpg
    rmfdmn
    rmfset
    rmvol
    salvage
    savemeta
    shblk
    shfragbf
    showfdmn
    showfile
    showfsets
    stripe
    switchlog
    tag2name
    vbmtchain
    vbmtpg
    vdf
    vdump
    verify
    vfile
    vfilepg
    vfragpg
    vlogpg
    vlsnpg
    vrestore
    vsbmpg
    vtagpg


Tables

0-1 Course Schedule
0-2 Course Conventions
1-1 Trashcan Commands
2-1 Metadata Bitfile Tags
2-2 .tags for Metadata Bitfiles
5-1 AdvFS Commands and Utilities
5-2 Log Size Guidelines
5-3 BMT Extent Map Allocations
5-4 salvage Options


Figures

0-1 Course Map
1-1 Two Filesets Drawing Storage from a Domain Containing Three Volumes
1-2 Filesets are not Necessarily Related to Partitions
1-3 Extent-Based Storage
1-4 Event Sequence for Logging a Transaction
1-5 Fileset Clones
1-6 File Striping
1-7 Trashcans
1-8 File Access: The Big Picture
1-9 AdvFS Architecture Overview
2-1 Two-Level Implementation of AdvFS
2-2 Using AdvFS Metadata to Translate FAS to BAS
2-3 BAS On-Disk Format
2-4 Tag Number to BMT mcell to Logical Blocks
2-5 BAS On-Disk Metadata Bitfiles
2-6 Fileset Tag Directory Locating Primary Mcell
2-7 Reserved Bitfiles On Disk Layout
2-8 Mcell Page Structure
2-9 File Access Through Tag File
2-10 Tag File Allowing Transparent Data Move
2-11 Fileset Tag File Before and After Cloning
2-12 Clone Structures After Data Write
2-13 Relationship to POSIX Files
2-14 Fragment Bitfile Locating Fragment Groups
3-1 Big Picture
3-2 In-Memory Per File Structures
3-3 In-Memory Per Fileset Structures


About This Course


Introduction

This section describes the contents of the course, suggests ways in which you can most effectively use the materials, and sets up the conventions for the use of terms in the course. It includes:

• Course description — a brief overview of the course contents

• Target audience — who should take this course

• Prerequisites — the skills and knowledge needed to ensure your success in this course

• Course goals and nongoals — what skills or knowledge the course will and will not provide

• Course organization — the structure of the course

• Course map — the sequence in which you should take each chapter

• Chapter descriptions — brief descriptions of each chapter

• Time schedule — an estimate of the amount of time needed to cover the chapter material and lab exercises

• Course conventions — explanation of symbols and signs used throughout this course

• Resources — manuals and books to help you successfully complete this course

Course Description

This lecture-lab course focuses on the internals of the Advanced File System (AdvFS). Practical troubleshooting training on AdvFS is also presented.

Place in Curriculum

This course is part of the UNIX advanced curriculum for system administration and support personnel.

Target Audience

This course is designed for system administrators and support engineers who service or support AdvFS configurations.


Prerequisites

To get the most from this course, you should be able to:

• Install and manage a Tru64 UNIX system.

• Install layered products and register license PAKs.

• Troubleshoot the operating system and make adjustments to improve performance.

• Manage traditional UNIX disk partitions.

• Perform typical UNIX system management tasks.

• Use the Tru64 UNIX kernel configuration tools.

• Set up and manage the LSM, AdvFS, and hardware Redundant Arrays of Independent Disks (RAID) using the command line or graphical user interface.

These prerequisites can be satisfied by taking the following courses:

• Tru64 UNIX System Administration lecture-lab or self-paced

• AdvFS, LSM, and RAID Configuration and Management

Course Goals

To support customers with complex configurations using AdvFS, you should be able to:

• Describe AdvFS internals (architecture, on-disk structures, in-memory structures, algorithms, functions/commands).

• Troubleshoot AdvFS problems.

Nongoals

This course does not cover the following topics:

• RAID hardware and software concepts

• Hierarchical Storage Operating Firmware (HSOF) architecture and design

• Hardware RAID, LSM, and AdvFS configuration and management

• Hardware installation/maintenance and troubleshooting

• Tru64 UNIX operating system internals

Taking This Course

Course Organization

This Course Guide is divided into chapters designed to cover a skill or related group of skills required to fulfill the course goals. Illustrations are used to present conceptual material. Examples are provided to demonstrate concepts and commands.

In this course, each chapter consists of:

• An introduction to the subject matter of the chapter.

• One or more objectives that describe the goals of the chapter.

• A list of resources, or materials for further reference. Some of these manuals are included with your course materials. Others may be available for reference in your classroom or lab.

• The text of each chapter, which includes outlines, tables, figures, and examples.

• A summary that highlights the main points presented in the chapter.

• Exercises that enable you to practice your skills and measure your mastery of the information learned during the course.

Course Map

The Course Map shows how each chapter is related to other chapters and to the course as a whole. Before studying a chapter, you should master all of its prerequisite chapters. On the Course Map, prerequisite chapters appear before the chapters that depend on them, and the direction of the arrows indicates the order in which the chapters should be covered.


Figure 0-1: Course Map

Chapter Descriptions

A brief description of each chapter is listed below.

• Advanced File System Concepts — an overview of the Advanced File System (AdvFS). It includes terminology such as logging, clones, striping, and trashcans, as well as an overview of AdvFS commands.

• AdvFS On-Disk Structures — information about AdvFS on-disk structures. It includes a review of BAS on-disk metadata bitfiles, extent maps, tags, fragments, the storage bitmap, and miscellaneous bitfiles.

• AdvFS In-Memory Structures — information about AdvFS in-memory structures, including the VFS, FAS, and BAS layers.

• AdvFS System Calls and Kernel Interfaces — an overview of the entries into AdvFS. It follows startup and recovery, storage management, cloning, file migration, and threads.


• Troubleshooting AdvFS — AdvFS troubleshooting tips and case studies.

Time Schedule

The amount of time required for this course depends on each student's background knowledge, experience, and interest in the various topics.

Use the following table as a guideline.

Table 0-1: Course Schedule

Day  Chapter/Appendix Number  Chapter/Appendix Name  Lecture/Reading Hours  Lab/Exercise Hours
1  1  Advanced File System Concepts  1 hour
   2  AdvFS On-Disk Structures  2 hours  1 hour
   A  AdvFS Commands and Utilities Appendix  1 hour
2  3  AdvFS In-Memory Structures  2 hours  2 hours
   4  AdvFS System Calls and Kernel Interfaces  1 hour  1 hour
   5  Troubleshooting AdvFS  1 hour

Course Conventions

This book uses the following conventions.

Table 0-2: Course Conventions

Convention Description

keyword Keywords (for emphasis) and websites are displayed in this typeface.

example Examples, commands, options, and pathnames are displayed in this typeface.

command(n) Cross-references to command documentation include the section number in the reference pages. For example, fstab(5) means fstab is referenced in Section 5.

$ A dollar sign represents the user prompt.

# A number sign represents the superuser prompt.

[key] This symbol indicates that the named key on the keyboard is pressed.

.
.
.   In examples, a vertical ellipsis indicates that not all lines in the example are shown.

[ ] In syntax descriptions, brackets indicate items that are optional.

variable Italics indicate new terms as well as items that are variable (in syntax descriptions).


Resources

For more information on the topics in this course, see the following:

• Tru64 UNIX AdvFS Reference Pages

• POLYCENTER Advanced File System and Utilities for DIGITAL UNIX; Guide to File System Administration

• Tru64 UNIX System Configuration and Tuning


1

Advanced File System Concepts


About This Chapter

Introduction

This chapter presents an overview of the features of the Advanced File System (AdvFS). It prepares students for the examination of the internal support for the concepts reviewed here.

Objectives

To describe AdvFS at an introductory level, you should be able to:

• Define the terms file domains, filesets, and volumes.

• Describe extent-based storage.

• Describe logging and the benefits of transactions.

• Describe at an advanced level: clones, file striping, and trashcan directories.

• Describe the AdvFS architecture and on-disk format at a high level.

Resources

For more information on topics in this chapter as well as related topics, see the following:

• Advanced File System Administration (Tru64 UNIX Version 4.0f or higher)

• AdvFS Reference pages


Introducing AdvFS

Overview

To study the internals of AdvFS, the basic concepts must be understood. This section reviews the following AdvFS terms and concepts:

• File domains and filesets

• AdvFS characteristics

• AdvFS capabilities

• Filesets and partitions

• Volumes

• Filesets

File Domains and Filesets

Filesets and file domains are distinguishing components of AdvFS. Filesets are similar to mountable file systems. A file domain represents the pool of storage from which the filesets allocate their storage space. The term volume represents the actual storage entity within a domain.

• File domain is a named set of one or more volumes that provides a shared pool of physical storage.

• Volume is any mechanism that behaves like a UNIX block device.

— An entire disk

— A disk partition

— A logical volume configured with the Logical Storage Manager (LSM)

• Fileset represents a portion of the directory hierarchy.

— Follows the logical structure of a traditional UNIX file system.

— Hierarchy of directory names and file names. It's what you mount.

AdvFS Characteristics

The “pools of storage” called domains are one of the characteristics that make AdvFS an advanced file system. Most other file systems lack the ability to draw storage from a pool shared among multiple filesets.

AdvFS goes beyond UFS by allowing you to create multiple filesets that share a common pool of storage within a defined file domain.
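The shared-pool idea can be illustrated with a minimal Python sketch. The Domain and Fileset classes, the second volume name (dsk3c), and the "pick the volume with the most free space" placement rule are all hypothetical and chosen only for illustration; they are not AdvFS interfaces or the actual AdvFS allocation policy.

```python
class Domain:
    """Toy model of a file domain: a named pool of volumes (hypothetical)."""

    def __init__(self, name, volumes):
        # volumes maps a volume name to its number of free blocks
        self.name = name
        self.free = dict(volumes)

    def allocate(self, blocks):
        """Take blocks from the volume with the most free space (assumed policy)."""
        vol = max(self.free, key=self.free.get)
        if self.free[vol] < blocks:
            raise OSError("domain out of space")
        self.free[vol] -= blocks
        return vol


class Fileset:
    """Toy model of a fileset: it owns no storage, it draws from its domain."""

    def __init__(self, name, domain):
        self.name = name
        self.domain = domain
        self.used = 0

    def write(self, blocks):
        vol = self.domain.allocate(blocks)
        self.used += blocks
        return vol


dom = Domain("usr_domain", {"dsk2g": 1000, "dsk3c": 500})
usr = Fileset("usr", dom)
var = Fileset("var", dom)
usr.write(600)    # served from dsk2g
var.write(450)    # served from dsk3c, which now has more free space
print(dom.free)   # {'dsk2g': 400, 'dsk3c': 50}
```

Both filesets draw from the same pool, so space freed or added at the domain level is available to either of them.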


A fileset is similar to a file system in the following ways:

• You can mount filesets like you can mount file systems.

• Filesets can have quotas enabled.

• Filesets can be backed up.

AdvFS separates the directory layer from the storage layer. It allows management of the physical storage separately from the directory hierarchy. The directory hierarchy handles file naming and the file system interface – opening and reading files.

The physical storage layer handles write-ahead logging, file allocation, and physical disk I/O functions. It can move a file from one disk to another within a storage domain without changing its pathname.

AdvFS Capabilities

Some special capabilities are available within AdvFS, such as filesets, which offer features not provided by other file systems:

• You can clone a fileset and back it up while users are still accessing the original.

• A fileset can span several disks (volumes) in a file domain.

The most basic advancement provided by AdvFS is the ability to create a fileset that can span multiple volumes.

The following figure depicts two filesets that draw their disk storage from the three volumes within the domain.


Figure 1-1: Two Filesets Drawing Storage from a Domain Containing Three Volumes

Filesets and Partitions

Each fileset is a uniquely named set of directories and files that form a subtree structure.

The following figure distinguishes a fileset from a partition.


Figure 1-2: Filesets are not Necessarily Related to Partitions

Commands associated with domains, volumes and filesets are: mkfdmn, mkfset, addvol, rmvol, showfdmn, showfsets.

Volumes

Volumes represent the basic storage building block for AdvFS. They are sometimes referred to as virtual disks because they function as a disk would in less sophisticated file systems.

An AdvFS volume is:

• A physical storage building block for a file domain.

• Any logical UNIX block device:

— "Real" disk partition

— Hardware RAID logical disk

— LSM volume

• Administered from /etc/fdmns.

Note that the contents of /etc/fdmns should not be changed manually. Any changes should be introduced using AdvFS utilities (such as addvol, rmvol, mkfdmn, and so forth).

The following example shows a symbolic link in a directory under the /etc/fdmns directory pointing to the volume (actual disk storage) that composes this domain. If the domain had more than one volume, there would be multiple links shown. Note that there is a directory under /etc/fdmns for each domain.


Example 1-1: Displaying a Directory Under /etc/fdmns

# ls -l /etc/fdmns/usr_domain
total 0
lrwxr-xr-x 1 root system 15 Mar 17 17:56 dsk2g -> /dev/disk/dsk2g

Filesets

Filesets are the mountable entities within AdvFS. They function similarly to UFS file systems.

A fileset is:

• A file or directory tree mapped to a domain.

• Created using the command mkfset or through dxadvfs.

• Mounted like a file system.

• Administered from the /etc/fstab file.

The following example shows a simple /etc/fstab file with the last line representing a request to mount the usr fileset, which is within the usr_domain domain on the /usr mount point.

Example 1-2: Mounting Through /etc/fstab

# cat /etc/fstab
/dev/disk/dsk2a / ufs rw 1 1
/proc /proc procfs rw 0 0
usr_domain#usr /usr advfs rw 0 2
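The domain#fileset naming in the last line of Example 1-2 can be picked apart with a few lines of Python. This is only a sketch for reading the example; parse_fstab_line is a hypothetical helper, not a system utility, and it handles only the whitespace-separated layout shown above.

```python
def parse_fstab_line(line):
    """Split one /etc/fstab line into fields; return None for blanks and comments."""
    fields = line.split()
    if not fields or fields[0].startswith("#"):
        return None
    spec, mount_point, fs_type = fields[0], fields[1], fields[2]
    entry = {"spec": spec, "mount_point": mount_point, "type": fs_type}
    if fs_type == "advfs":
        # AdvFS entries name what to mount as domain#fileset
        domain, fileset = spec.split("#", 1)
        entry["domain"] = domain
        entry["fileset"] = fileset
    return entry


FSTAB = """\
/dev/disk/dsk2a / ufs rw 1 1
/proc /proc procfs rw 0 0
usr_domain#usr /usr advfs rw 0 2
"""

entries = [e for e in map(parse_fstab_line, FSTAB.splitlines()) if e]
advfs = [e for e in entries if e["type"] == "advfs"]
print(advfs[0]["domain"], advfs[0]["fileset"], advfs[0]["mount_point"])
# usr_domain usr /usr
```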


Using Extent-Based Storage

Overview

AdvFS uses an extent-based storage system to store the data within a file. An extent-based strategy allows a contiguous file to be located using a single extent.

This section introduces:

• Extent concepts

• Using the showfile command to display extents

Extent Concepts

Any attempt to create a file with content involves allocating some disk space from the volumes within the domain. These chunks of disk space are referred to as file extents.

AdvFS always attempts to write each file to disk as a set of contiguous pages called an extent. When a file consists of a few large extents and file access is sequential, I/O performance should be optimal.

An extent map translates the bitfile pages (8192 bytes each) to disk blocks (512 bytes each). The AdvFS storage allocation policy adds pages to a file by preallocating one-fourth of the file size up to 16 pages each time the file is appended. This fosters larger extents. When the file is closed, excess preallocated space is truncated, so space is not wasted. When a file uses only part of the last page, a file fragment is created.

Rather than wasting the rest of the page in the extent, the space is allocated from a special file, the fileset fragment (frag) file. Each fileset has a frag file, containing seven groups of fragments, from 1 KB to 7 KB in size. A fragment is allocated from the appropriately sized group.
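The arithmetic behind preallocation and fragment sizing can be sketched in Python. The page size, the one-fourth rule, the 16-page cap, and the 1 KB to 7 KB groups come from the text above; the function names, the use of integer division, and the treatment of a tail larger than 7 KB (assumed here to get a full page rather than a fragment) are assumptions made for illustration only.

```python
PAGE_SIZE = 8192          # bytes per bitfile page (from the text)
FRAG_UNIT = 1024          # fragment groups come in 1 KB steps (from the text)
MAX_PREALLOC_PAGES = 16   # preallocation cap (from the text)


def prealloc_pages(file_pages):
    """One-fourth of the current file size in pages, capped at 16 pages."""
    return min(file_pages // 4, MAX_PREALLOC_PAGES)


def frag_group_kb(file_size):
    """Return the fragment group (1 to 7, in KB) for a partial last page, else 0.

    Assumption: a tail that needs more than 7 KB is stored in a full page.
    """
    tail = file_size % PAGE_SIZE
    if tail == 0:
        return 0
    kb = -(-tail // FRAG_UNIT)   # round up to whole kilobytes
    return kb if kb <= 7 else 0


print(prealloc_pages(8), prealloc_pages(40), prealloc_pages(200))   # 2 10 16
print(frag_group_kb(3 * PAGE_SIZE + 2500))                          # 3
```

In the second call, a file of three full pages plus 2500 bytes would keep three pages in its extent and place the tail in the 3 KB fragment group.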

The following figure shows the relationship between the logical file, the extent map, and the actual disk space.


Figure 1-3: Extent-Based Storage

Displaying Extents Using the showfile Command

The showfile command is one of the most heavily used in AdvFS. Use showfile to view AdvFS details pertaining to an individual file.

The showfile command displays the extent map of each file. An extent is a contiguous area of disk space that the file system allocates to a file.

• Simple files have one extent map.

• Striped files have an extent map for every stripe segment.

The following example shows a file with a single extent of three pages in size.


Example 1-3: Using showfile to Display a Contiguous File

# showfile -x /usr/users/obrien/disktab

       Id  Vol  PgSz  Pages  XtntType  Segs  SegSz    I/O  Perf  File
596b.8001    1    16      3    simple    **     **  async  100%  disktab

    extentMap: 1
        pageOff    pageCnt    vol    volBlock    blockCnt
              0          3      1      576496          48
        extentCnt: 1
#

XtntType is the extent type, which can be:

• simple, a regular AdvFS file without special extents.

• stripe, a striped file.

• symlink, a symbolic link to a file (ufs, nfsv3, and so on).

The following example shows an empty striped file with two stripes.

Example 1-4: Using showfile to Display a Striped File with Two Stripes

# showfile -x /usr/dennis/stripe1

       Id  Vol  PgSz  Pages  XtntType  Segs  SegSz    I/O  Perf  File
   c.8001    2    16      0    stripe     2      8  async  100%  stripe1

    extentMap: 1
        pageOff    pageCnt    volIndex    volBlock    blockCnt
        extentCnt: 0

    extentMap: 2
        pageOff    pageCnt    volIndex    volBlock    blockCnt
        extentCnt: 0

The showfile command cannot display attributes for symbolic links or non-AdvFS files.

This example shows the limitations of showfile when used on a UFS file.


Example 1-5: showfile Command Output from a UFS File

# showfile -x /vmunix

       Id  Vol  PgSz  Pages  XtntType  Segs  SegSz    I/O  Perf  File
       **   **    **     **       ufs    **     **     **    **  vmunix

A simple file has one extent map while a striped file has more than one extent map.

    extentMap: 1
        pageOff    pageCnt    vol    volBlock    blockCnt
              0          3      1      576496          48
        extentCnt: 1

An extent map displays the following information:

pageOff Starting page number of the extent

pageCnt Number of 8K pages in the extent

vol Number indicating which volume within the domain contains this file

volBlock Starting block number of the extent

blockCnt Number of blocks in the extent

extentCnt Number of extents
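The fields listed above are enough to translate a bitfile page number into a disk block. The following Python sketch uses the single extent from Example 1-3; the Extent tuple and the page_to_block function are hypothetical names for illustration, not AdvFS kernel structures.

```python
from collections import namedtuple

BLOCKS_PER_PAGE = 8192 // 512   # 16 disk blocks (512 bytes) per 8 KB page

Extent = namedtuple("Extent", "page_off page_cnt vol vol_block")


def page_to_block(extent_map, page):
    """Return (volume, starting disk block) for a bitfile page number."""
    for ext in extent_map:
        if ext.page_off <= page < ext.page_off + ext.page_cnt:
            offset_pages = page - ext.page_off
            return ext.vol, ext.vol_block + offset_pages * BLOCKS_PER_PAGE
    raise KeyError(f"page {page} is not mapped")


# Example 1-3: pageOff 0, pageCnt 3, vol 1, volBlock 576496
disktab = [Extent(0, 3, 1, 576496)]
print(page_to_block(disktab, 2))               # (1, 576528)
print(disktab[0].page_cnt * BLOCKS_PER_PAGE)   # 48, matching blockCnt
```

The second line of output shows why blockCnt is 48 in the example: three 8 KB pages occupy 3 x 16 disk blocks.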


Logging

Overview

A characteristic of AdvFS is that the recovery time after a power failure or crash is minimal. This section introduces AdvFS logging.

• Why logging

• Logging a transaction

• AdvFS logging

Why Logging

Another distinguishing characteristic of AdvFS is metadata logging. AdvFS tracks alterations to on-disk metadata by logging the transaction as it occurs. Many file system operations involve several widely separated writes to disk.

A transaction usually consists of more than one write. A crash in between the writes leaves the on-disk file system inconsistent.

The two main benefits of using transactions and logging are:

• Fast crash recovery

This is achieved by using the transaction log to redo committed transactions and undo uncommitted transactions.

The log has a fixed size regardless of the domain size so this bounds the time for recovery.

This is in contrast to UFS which relies on fsck to repair the file system (the time to run fsck is proportional to the number of files in a file system, so as disks get bigger, fsck takes longer).

On average AdvFS crash recovery takes about 10 to 15 seconds.

• Improved performance for metadata-intensive operations

Since all metadata modifications are first written to the log and can be recreated using just the log, the file system can write the actual modifications to disk at a later time.

This allows the file system to wait until it can do bigger I/Os by collecting many unrelated metadata modifications into fewer I/Os.

This is in contrast to UFS which relies on ordered synchronous writes to maintain metadata consistency. The writes are ordered such that fsck can easily repair inconsistencies.


Logging a Transaction

The following figure shows the event sequence when a transaction is logged.

Figure 1-4: Event Sequence for Logging a Transaction

1. Storage is allocated in the bitfile metadata table (BMT) (log record 1).

2. The bitfile tag slot is allocated (log record 2).

3. The directory entry is changed (log record 3).

4. The transaction is committed (log record 4).

5. The buffered log records are written to disk.

6. The buffered bitfile pages are written to disk and the log pages are removed.

AdvFS Logging

AdvFS logging covers:

• Modifications to its own metadata (internal structures)

• Not user file data (unless atomic write data logging has been enabled using chfile -L)

For each transaction, AdvFS:

• Writes a series of log records describing all changes for an operation to disk.

• Performs changes (writes changed blocks to disk).

In case of crash, on restart, the on-disk log indicates which transactions are complete.


The transaction log records changes to metadata (bitfile and directory data). For example, file creation requires modifying more than one on-disk structure:

• File directory to insert a new file name

• Fileset tag directory to allocate the new file’s tag

• Bitfile table to allocate an entry for the new file

When a file is created, these structures must all be updated.

The changes are first written to a log and then to disk. If the creation process is interrupted by a system crash, the creation process can be completed or undone based on the log.

If the log information is complete, recovery finishes the file creation on disk. If the log information is incomplete, recovery undoes the file creation on disk. Either way, the file system is left in a consistent state.

During crash recovery, the log is read to determine what changes must be completed or undone. Since the log has a limited size, the recovery time is bounded by the amount of time it takes to process the log. A larger domain will take no longer to recover. In practice, crash recovery takes less than 10 seconds.
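The redo and undo behavior described in this section can be sketched in Python. This is a toy model under stated assumptions: the record layout, the recover function, and the key and value strings (such as "mcell 7") are invented for illustration and do not reflect the AdvFS on-disk log format.

```python
def recover(disk, log):
    """Replay a write-ahead log against the on-disk state found after a crash."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Redo pass: reapply every change that belongs to a committed transaction.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _, key, _old, new = rec
            disk[key] = new
    # Undo pass: roll back changes of uncommitted transactions, newest first.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _, key, old, _new = rec
            if old is None:
                disk.pop(key, None)
            else:
                disk[key] = old
    return disk


# Transaction 1 creates file "a" and commits; transaction 2 creates file "b"
# but the crash happens before its commit record is written.
log = [
    ("update", 1, "bmt:a", None, "mcell 7"),
    ("update", 1, "tag:a", None, "tag slot 12"),
    ("update", 1, "dir:a", None, "entry a"),
    ("commit", 1),
    ("update", 2, "bmt:b", None, "mcell 8"),
    ("update", 2, "tag:b", None, "tag slot 13"),
]
disk = {"bmt:a": "mcell 7", "bmt:b": "mcell 8"}   # partial state at crash time
recover(disk, log)
print(sorted(disk))   # ['bmt:a', 'dir:a', 'tag:a']
```

The work done by recover depends only on the length of the log, not on the size of disk, which mirrors the bounded recovery time described above.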


Cloning

Overview

To allow the uninterrupted use of AdvFS data while maintenance operations are in progress, a temporary “clone” of the file system can be requested. This section will introduce cloning concepts.

• Cloning a fileset using the clonefset command

• Fileset clones

• Cloning issues

Cloning a Fileset Using clonefset

Another distinguishing capability within AdvFS is the ability to create a virtual clone of the fileset. Note that the clone is not an actual copy of the data.

Cloning a fileset involves these steps:

1. Locking the master (original) fileset

2. Creating the clone fileset

3. Copying the tag directory of the master to the clone

4. Incrementing the clone count in the master fileset

5. Setting the clone’s cloneID = clone count in the master

Handling a write to the master involves these steps:

1. Creating a bitfile for the file in the clone fileset if the cloned bitfile does not already exist in the clone fileset

2. Modifying the clone fileset’s tag directory to reference the new file

3. Allocating an extent in the new file for the portion being written

4. Copying the original data to the new extent

5. Allowing the write to occur to the file in the master

If the cloned bitfile already exists, it does not include an extent for the portion being written.
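The write-handling steps above amount to copy-on-write. A minimal Python sketch follows, assuming a fileset can be modeled as a dictionary from tag to a list of pages; the Master and Clone classes and their methods are hypothetical, and details such as locking, tag directories, and files created after the clone are left out.

```python
class Clone:
    """Read-only view of a master fileset as of clone creation (toy model)."""

    def __init__(self, master):
        self.master = master
        self.saved = {}   # tag -> {page number: original data}

    def preserve(self, tag, page, data):
        # Only the first write to a page copies the original data.
        self.saved.setdefault(tag, {}).setdefault(page, data)

    def read(self, tag, page):
        if page in self.saved.get(tag, {}):
            return self.saved[tag][page]
        # Pages never overwritten in the master are still shared with it.
        return self.master.files[tag][page]


class Master:
    """Original fileset; writes trigger copy-on-write into the clone."""

    def __init__(self, files):
        self.files = {tag: list(pages) for tag, pages in files.items()}
        self.clone = None

    def make_clone(self):
        self.clone = Clone(self)   # no data is copied at creation time
        return self.clone

    def write(self, tag, page, data):
        if self.clone is not None:
            self.clone.preserve(tag, page, self.files[tag][page])
        self.files[tag][page] = data


master = Master({"t1": ["A0", "A1"]})
clone = master.make_clone()
master.write("t1", 0, "B0")
master.write("t1", 0, "C0")    # second write: nothing more is copied
print(clone.read("t1", 0), clone.read("t1", 1))   # A0 A1
print(master.files["t1"])                         # ['C0', 'A1']
```

Creating the clone copies no data, which is consistent with the point that a clone is not an actual copy of the fileset.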

Fileset Clones

A clone fileset is a read-only copy of a fileset created to capture fileset data at a particular time. The contents of the clone fileset can be backed up while the original fileset remains available to users.

The following figure shows cloning actions including copy-on-write (COW).


Figure 1-5: Fileset Clones

Cloning Issues

Consider these issues when working with cloned filesets:

• Applications should not be writing to the master when the clone is created.

Fortunately, creating a clone is very fast (seconds) because of copy-on-write.

• A clone is not a backup; it is a tool for minimizing down time for a fileset due to backups:

— Create clone of fileset

— Back up from clone

— Delete clone

[Figure 1-5 labels: Domain; Application (write); Backup tool (read); COW read. Stages shown: after clone is created, before any writes; first write to a block in the original (master) fileset; access to COW write blocks in the cloned fileset.]


Striping

Overview

AdvFS provides file striping to potentially enhance the performance of file I/O intensive applications. This section introduces the concept of file striping.

File Striping

AdvFS provides file-level striping to help spread the disk I/O over several volumes. The stripe utility directs a zero-length file (a file with no data written to it) to be spread evenly across several volumes within a file domain.

As data is appended to the file, the data is spread across the volumes. AdvFS determines the number of pages per stripe segment and alternates the segments among the disks in a sequential pattern.
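The sequential alternation of stripe segments can be sketched as a small Python function. The segment size of 8 pages is taken from the SegSz column of Example 1-4 on the assumption that it is expressed in pages; the function name and the volume names are hypothetical.

```python
def stripe_location(page, volumes, seg_pages=8):
    """Map a file page to (volume, page offset within that volume's stripe)."""
    segment = page // seg_pages                  # which stripe segment
    vol = volumes[segment % len(volumes)]        # segments rotate over volumes
    local_segment = segment // len(volumes)      # segments already on this volume
    return vol, local_segment * seg_pages + page % seg_pages


vols = ["vol1", "vol2", "vol3"]
for p in (0, 7, 8, 16, 24, 30):
    print(p, stripe_location(p, vols))
# 0 ('vol1', 0)
# 7 ('vol1', 7)
# 8 ('vol2', 0)
# 16 ('vol3', 0)
# 24 ('vol1', 8)
# 30 ('vol1', 14)
```

Sequential pages fill one segment on a volume, then move to the next volume, so a large sequential transfer is spread over all volumes in the stripe.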

Existing, nonzero-length files cannot be striped using the stripe utility.

To stripe an existing file, create a new file, use the stripe utility to stripe the new file, and copy the contents of the file you want to stripe into the new striped file. After copying the file, delete the nonstriped file.

Once a file is striped, you cannot use the stripe utility to modify the number of disks that a striped file crosses. To change the volume count of a striped file, you can create a second file with a new volume count, and then copy the contents of the first file into the second file. After copying the file, delete the first file.

The following figure depicts the blocks of a file (1,2,...) that is striped over three volumes. Note that block one is on the first volume, block two is on the second volume, and so forth.


Figure 1-6: File Striping


Using Trashcans

Overview

An advanced file system should provide some user-level comforts as well as administrator and application-level features. This section introduces the notion of a “trashcan” directory from which deleted files can be retrieved.

Overview of Trashcans

The trashcan component within AdvFS allows administrators to prepare for the inadvertent removal of files. The deleted files are moved to the trashcan directory in case the user wants them back. You can configure your systems to retain a copy of deleted files.

Trashcan directories can be attached to one or more directories within the same fileset. Once attached, any file deleted from an attached directory is automatically moved to the trashcan directory. The last version of a file deleted from a directory with a trashcan attached can be returned to the original directory with the mv command.

Root-user privilege is not required to retrieve files from a trashcan directory.

Restrictions include:

• You can restore only the most recently deleted version of a file.

• You can attach more than one directory to the same trashcan directory; however, if you delete files with identical file names from the attached directories, only the most recently deleted file remains in the trashcan directory. Files deleted from the trashcan directory are unrecoverable.
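The restrictions above follow naturally if the trashcan is thought of as an ordinary directory keyed by file name. The following Python sketch models that behavior; the TrashcanFileset class, its methods, and the paths are hypothetical stand-ins and do not reproduce the syntax of the real mktrashcan, rm, or mv commands.

```python
class TrashcanFileset:
    """Toy model of directories in one fileset with attached trashcans."""

    def __init__(self):
        self.dirs = {}       # directory -> {file name: contents}
        self.attached = {}   # directory -> trashcan directory

    def mktrashcan(self, trashcan, directory):
        self.dirs.setdefault(trashcan, {})
        self.dirs.setdefault(directory, {})
        self.attached[directory] = trashcan

    def rm(self, directory, name):
        data = self.dirs[directory].pop(name)
        trashcan = self.attached.get(directory)
        if trashcan is not None:
            # A file with the same name already in the trashcan is replaced.
            self.dirs[trashcan][name] = data
        # With no trashcan attached (including the trashcan itself), data is gone.

    def mv(self, src_dir, name, dst_dir):
        self.dirs[dst_dir][name] = self.dirs[src_dir].pop(name)


fs = TrashcanFileset()
fs.mktrashcan("/usr/trash", "/usr/a")
fs.mktrashcan("/usr/trash", "/usr/b")
fs.dirs["/usr/a"]["notes"] = "from a"
fs.dirs["/usr/b"]["notes"] = "from b"
fs.rm("/usr/a", "notes")
fs.rm("/usr/b", "notes")
print(fs.dirs["/usr/trash"])       # {'notes': 'from b'}
fs.mv("/usr/trash", "notes", "/usr/b")
print(fs.dirs["/usr/b"])           # {'notes': 'from b'}
```

Only the most recently deleted "notes" survives in the shared trashcan, and a file removed from the trashcan directory itself would not be kept anywhere.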

The following table lists and defines the commands for setting up and managing trashcans.

Table 1-1: Trashcan Commands

Command Function

mktrashcan Creates the trashcan

shtrashcan Shows the contents of the trashcan

rmtrashcan Removes the trashcan directory

The following picture depicts a standard directory hierarchy on the left. If there is a trashcan directory associated with the fileset, any removed files are placed in the trashcan. If necessary, the files can be moved from the trashcan back to the directory.


Figure 1-7: Trashcans


Reviewing AdvFS Commands

Overview

This section reviews some of the commonly used AdvFS commands.

• File domain commands

• Fileset commands

• File commands

File Domain Commands

Some file domain commands are shown here.

Command Function

mkfdmn Creates a file domain

addvol Adds a new volume to the domain

rmvol Removes a volume from the domain

balance Distributes storage over the volumes evenly

defragment Makes files contiguous if possible

Fileset Commands

Some fileset commands are shown here.

Command Function

mkfset Creates a fileset

chfsets Changes fileset characteristics

clonefset Creates a fileset clone

File Commands

These are a few file commands.

Command Function

migrate Moves a file from one volume to another

stripe Creates an empty striped file

mktrashcan Creates a trashcan directory


Examining AdvFS Architecture and Components

Overview

This section examines how AdvFS is put together. The concepts are necessary to understand the internals of AdvFS.

• AdvFS architecture

• AdvFS components

• AdvFS in Tru64 UNIX V5

AdvFS Architecture

AdvFS includes two kernel subsystems:

• File access subsystem (FAS)

— Emulates UNIX file system (UFS) and POSIX file and directory semantics

— Uses bitfiles to implement files and directories

• Bitfile access subsystem (BAS)

— A bitfile is an array of 8K pages named via a tag.

— A tag is a unique identifier within a domain, similar to an inode number.

The bitfile access subsystem manipulates bitfiles: create, open, read, write, add and remove storage. It also interfaces with the buffer cache, the Volume Manager interface, and I/O scheduling. BAS provides:

• Transaction and log management

• Storage placement and management

• Domain and fileset management

The following figure depicts the software components that may be involved with a typical disk I/O. Note the AdvFS software component in the middle. The Virtual File System (VFS) software directs the processing toward the appropriate file system-specific software.


Figure 1-8: File Access: The Big Picture

[Figure 1-8 labels: in user mode, the application issues system calls (open(), close(), read(), write()). In kernel mode, the Virtual File System (VFS) uses a vnode structure to represent files from any file system; the UNIX File System (UFS) uses inode structures, the Advanced File System (AdvFS) uses bfNode structures, and the Network File System (NFS) uses rnode structures to represent files. The figure also shows the Logical Storage Manager (LSM) pseudo disk driver and the disk driver.]

Once VFS directs the I/O processing to the AdvFS software, the AdvFS processing can be thought of as having two levels, FAS and BAS, as shown in the following figure.

Figure 1-9: AdvFS Architecture Overview

AdvFS Components

AdvFS consists of these components:

• File access subsystem - the POSIX file system layer in AdvFS

— Translates VFS file system requests into BAS requests

— Components:

* Mount, unmount, initialization

* Directory operations (lookup, create, delete)

* File operations (create, read, write, stat, delete, rename)

• Bitfile access subsystem - the bitfile layer in AdvFS

— Components:

* Domain operations (create, delete, open, close)

* Bitfile set operations (create, delete, clone, open, close)

* Bitfile operations (create, delete, open, close, migrate, read, write, add and remove stg)

* Transaction management operations (start, stop, fail, pin pg, pin record, lock, recover)

* Buffer cache operations (pin and unpin page, ref and deref page, flush bitfile, flush cache, prefetch pages, I/O queuing)

* Volume operations (add, remove)


The term bitfile refers to a generic file as is supported by the BAS. Files in the FAS are simply bitfiles to which the FAS applies POSIX semantics. Therefore, files are instantiated via bitfiles, and in general, file and bitfile are equivalent.

AdvFS in Tru64 UNIX V5

Many changes are included in the latest release of Tru64 UNIX (V5).

Version five of Tru64 UNIX has a new version of the on-disk structure of AdvFS. The previous version of the AdvFS on-disk structure was V3. In Tru64 UNIX V5.0, the AdvFS on-disk structure will be at version four.

Additional features include faster directory searches for directories larger than 8K.

AdvFS has added some additional support for very large directories. The performance improvements include the creation of a B-tree index supporting directories greater than 8K in size. This dramatically improves file creation and deletion performance. Improvement becomes more noticeable when the directory contains more than ~2500 files.

Quota limits are now held in 8-byte fields yielding higher limits.

• Removal of metadata limitations (such as BMT page 0 restrictions)

• Direct I/O allowing I/O direct to the application’s address space (no UBC buffering)

• Smooth sync() operations to eliminate the update daemon 30-second system I/O bursts

• SMP improvements


Summary

Introducing AdvFS

A file domain is a named set of one or more volumes that provides a shared pool of physical storage.

A fileset represents a portion of the directory hierarchy. Each fileset is a uniquely named set of directories and files that form a subtree structure.

A volume is any mechanism that behaves like a UNIX block device.

Using Extent-Based Storage

The Advanced File System always attempts to write each file to disk as a set of contiguous pages. The set of contiguous pages is called an extent. An extent map translates the bitfile pages (8192 bytes each) to disk blocks (512 bytes each).

The AdvFS storage allocation policy adds pages to a file by preallocating one-fourth of the file size up to 16 pages each time the file is appended. This fosters larger extents. When the file is closed, excess preallocated space is truncated, so space is not wasted.

When a file uses only part of the last page, a file fragment is created. Rather than wasting the rest of the page in the extent, the space is allocated from a special file, the fileset frag file. Each fileset has a frag file, containing seven groups of fragments, from 1 KB to 7 KB in size. A fragment is allocated from the appropriately sized group.

Logging

Fast crash recovery is achieved by using the transaction log to redo committed transactions and undo uncommitted transactions. The log has a fixed size regardless of the domain size, so this bounds the time for recovery.

This is in contrast to UFS which relies on fsck to repair the file system. The time to run fsck is proportional to the number of files in a file system, so as disks get bigger, fsck takes longer. On average AdvFS crash recovery takes about 10 to 15 seconds.

Transaction logging also improves performance for metadata-intensive operations. Since all metadata modifications are first written to the log and can be recreated using just the log, the file system can write the actual modifications to disk at a later time. This allows the file system to wait until it can do bigger I/Os by collecting many unrelated metadata modifications into fewer I/Os.

This is in contrast to UFS which relies on ordered synchronous writes to maintain metadata consistency. The writes are ordered such that fsck can easily repair inconsistencies.


Cloning

A clone fileset is a read-only copy of a fileset created to capture fileset data at a particular time. The contents of the clone fileset can be backed up while the original fileset remains available to users.

Striping

The stripe utility directs a zero-length file (a file with no data written to it) to be spread evenly across several volumes within a file domain. As data is appended to the file, the data is spread across the volumes. AdvFS determines the number of pages per stripe segment and alternates the segments among the disks in a sequential pattern.

Using Trashcans

You can configure your systems to retain a copy of deleted files. Trashcan directories can be attached to one or more directories within the same fileset. Once attached, any file deleted from an attached directory is automatically moved to the trashcan directory. The last version of a file deleted from a directory with a trashcan attached can be returned to the original directory with the mv command.

Reviewing AdvFS Commands

Some file domain commands are shown here.

Command Function

mkfdmn Creates a file domain

addvol Adds a new volume to the domain

rmvol Removes a volume from the domain

balance Distributes storage over the volumes evenly

defragment Makes files contiguous if possible

Examining AdvFS Architecture and Components

AdvFS includes two kernel subsystems:

• File access subsystem

— Emulates UFS and POSIX file and directory semantics

— Uses bitfiles to implement files and directories

• Bitfile access subsystem

— Manipulates bitfiles: create, open, read, write, add and remove storage

A bitfile is an array of 8K pages, named via a tag. A tag is a unique identifier within a domain, similar to an inode number.

— Interfaces with buffer cache, VM interface, and I/O scheduling

— Provides transaction and log management

— Provides storage placement and management

— Provides domain and fileset management


Exercises

To successfully complete the following exercises, you must be able to perform the following tasks:

• Create an AdvFS file domain with multiple disks and filesets.

• Create an AdvFS clone fileset.

• Create an AdvFS striped file.

• Add and remove volumes to a file domain.

• Defragment a file domain.

• Balance a file domain.

• Add and change fileset attributes, in particular, fileset quotas.

• Use the showfdmn and showfsets commands to obtain information about an AdvFS file domain.

• Use the showfile command to obtain information about an AdvFS file.

• Recreate the AdvFS management structure contained in /etc/fdmns using mkdir and ln.
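As a rough picture of the last task, the sketch below rebuilds a domain directory containing one symbolic link per volume, which is the layout that mkdir and ln would produce under /etc/fdmns. It is an illustration only: the function name is hypothetical, it works in a scratch directory rather than the real /etc/fdmns, and the domain and device names are borrowed from the solutions later in this chapter.

```python
import os
import tempfile


def rebuild_fdmns(root, domain, devices):
    """Create <root>/<domain>/ holding one symlink per volume device."""
    domain_dir = os.path.join(root, domain)
    os.makedirs(domain_dir, exist_ok=True)           # mkdir <root>/<domain>
    for dev in devices:
        link = os.path.join(domain_dir, os.path.basename(dev))
        if not os.path.islink(link):
            os.symlink(dev, link)                     # ln -s <device> <link>
    return sorted(os.listdir(domain_dir))


# Work in a scratch directory instead of the real /etc/fdmns.
scratch = tempfile.mkdtemp()
links = rebuild_fdmns(scratch, "bruden_dom",
                      ["/dev/disk/dsk0a", "/dev/disk/dsk0b"])
```

The links do not need to resolve for the structure to be created, which is why the sketch can run on a system without those devices.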

Introducing AdvFS

If you are completely comfortable with AdvFS commands, skip this exercise set and move forward to Exercise Set 2.

1. Create a file domain using at least two volumes that contains at least two filesets. If you have only one disk, you may have to repartition it to get the two volumes.

2. Make mount points and mount the filesets.

3. Use df -t advfs to check on the available space for each fileset.

Using Extent-Based Storage

1. Add another volume to the domain. Check available space.

2. Create some large files to take up some space in the filesets. How can a fileset be prevented from taking up all of the available storage in the domain?

Cloning

1. Make a clone of one of your filesets. How long did it take to create?

2. Check the contents of the clone. Does it match the contents of the original?

3. Put a new file in the original fileset. Does it appear in the clone?

4. Try to add a new file to the clone. What happened? Explain.

Using Trashcans

1. Delete a file from one of your filesets. Can you get it back?

2. Create a trashcan. Associate it with your fileset.

3. Delete a file from the fileset. Can you get it back?

Striping

1. Create an empty striped file. Use the showfile -x command to view the extents of the empty file.

2. Show the extents of one of your large files.

3. Copy the large file to the empty striped file.

4. Revisit the extents of the striped file. Is there any performance difference in reading the two files?

5. Use # time cat /mnt_pnt/big_file > /dev/null to time each read.

Solutions

Introducing AdvFS

1. Create a file domain using at least two volumes that contains at least two filesets. If you have only one disk, you may have to repartition it to get the two volumes.

#

# disklabel -r /dev/rdisk/dsk0c

# /dev/rdisk/dsk0c:

type: SCSI

disk: RZ26F

label:

flags:

bytes/sector: 512

sectors/track: 57

tracks/cylinder: 14

sectors/cylinder: 798

cylinders: 2570

sectors/unit: 2050860

rpm: 5400

interleave: 1

trackskew: 40

cylinderskew: 43

headswitch: 0 # milliseconds

track-to-track seek: 0 # milliseconds

drivedata: 0

8 partitions:

# size offset fstype [fsize bsize cpg] # NOTE: values not exact

a: 131072 0 unused 0 0 # (Cyl. 0 - 164*)

b: 262144 131072 unused 0 0 # (Cyl. 164*- 492*)

c: 2050860 0 unused 0 0 # (Cyl. 0 - 2569)

d: 552548 393216 unused 0 0 # (Cyl. 492*- 1185*)

e: 552548 945764 unused 0 0 # (Cyl. 1185*- 1877*)

f: 552548 1498312 unused 0 0 # (Cyl. 1877*- 2569)

g: 819200 393216 unused 0 0 # (Cyl. 492*- 1519*)

h: 838444 1212416 unused 0 0 # (Cyl. 1519*- 2569)

#

#

#

#

# mkfdmn /dev/disk/dsk0a bruden_dom

#

#

#

# addvol /dev/disk/dsk0b bruden_dom

#

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

showfdmn: unable to display volume info; domain not active

#

#

#

#

# mkfset bruden_dom bruce_fset

#

# mkfset bruden_dom dennis_fset

#

#

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

showfdmn: unable to display volume info; domain not active

#

#

#

# mkdir /usr/bruce

# mkdir /usr/dennis

#

# mount bruden_dom#bruce_fset /usr/bruce

#

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 122432 7% on 256 256 /dev/disk/dsk0a

2 262144 261968 0% on 256 256 /dev/disk/dsk0b

---------- ---------- ------

393216 384400 2%

2. Make mount points and mount the filesets.

# mount bruden_dom#dennis_fset /usr/dennis

#

#

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 122432 7% on 256 256 /dev/disk/dsk0a

2 262144 261968 0% on 256 256 /dev/disk/dsk0b

---------- ---------- ------

393216 384400 2%

3. Use df -t advfs to check on the available space for each fileset.

# df -t advfs

Filesystem 512-blocks Used Available Capacity Mounted on

usr_domain#usr 1426112 1025604 294864 78% /usr

usr_domain#var 1426112 75678 294864 21% /var

bruden_dom#bruce_fset 393216 32 384400 1% /usr/bruce

bruden_dom#dennis_fset 393216 32 384400 1% /usr/dennis

#
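The showfdmn totals in this solution can be checked by hand. The sketch below recomputes the per-volume and domain-wide "% Used" figures from the size and free columns. The helper name is hypothetical, and rounding to the nearest whole percent is an assumption; it happens to reproduce the figures shown in the transcript, but the guide does not state how showfdmn rounds.

```python
def percent_used(total_blocks, free_blocks):
    """Percentage of 512-byte blocks in use, rounded to the nearest whole number."""
    used = total_blocks - free_blocks
    return int(100 * used / total_blocks + 0.5)


# (size, free) in 512-byte blocks, copied from the two-volume showfdmn output
volumes = {"dsk0a": (131072, 122432), "dsk0b": (262144, 261968)}

per_volume = {name: percent_used(size, free)
              for name, (size, free) in volumes.items()}
domain_size = sum(size for size, _ in volumes.values())
domain_free = sum(free for _, free in volumes.values())
domain_pct = percent_used(domain_size, domain_free)
```

The domain free total is also the "Available" figure that df reports for both filesets, since filesets without quotas share all free space in the domain.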

Using Extent-Based Storage

1. Add another volume to the domain. Check available space.

#

# addvol /dev/disk/dsk2h bruden_dom

#

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 122432 7% on 256 256 /dev/disk/dsk0a

2 262144 261968 0% on 256 256 /dev/disk/dsk0b

3 1858624 1858496 0% on 256 256 /dev/disk/dsk2h

---------- ---------- ------

2251840 2242896 0%

2. Create some large files to take up some space in the filesets. How can a fileset be prevented from taking up all of the available storage in the domain? Fileset quotas can be used to limit the disk space of a fileset.

# cp /vmunix /usr/bruce/big1

#

#

# df -t advfs

Filesystem 512-blocks Used Available Capacity Mounted on

usr_domain#usr 1426112 1025604 294864 78% /usr

usr_domain#var 1426112 75680 294864 21% /var

bruden_dom#bruce_fset 2251840 22784 2197392 2% /usr/bruce

bruden_dom#dennis_fset 2251840 22784 2197392 2% /usr/dennis

#

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 99680 24% on 256 256 /dev/disk/dsk0a

2 262144 239216 9% on 256 256 /dev/disk/dsk0b

3 1858624 1858496 0% on 256 256 /dev/disk/dsk2h

---------- ---------- ------

2251840 2197392 2%

#

# cp /vmunix /usr/bruce/big2

#

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 99680 24% on 256 256 /dev/disk/dsk0a

2 262144 239216 9% on 256 256 /dev/disk/dsk0b

3 1858624 1835744 1% on 256 256 /dev/disk/dsk2h

---------- ---------- ------

2251840 2174640 3%

#

#

# cp /vmunix /usr/dennis/big2

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 76928 41% on 256 256 /dev/disk/dsk0a

2 262144 239216 9% on 256 256 /dev/disk/dsk0b

3 1858624 1835744 1% on 256 256 /dev/disk/dsk2h

---------- ---------- ------

2251840 2151888 4%

#

#

# cp /vmunix /usr/dennis/big3

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 76928 41% on 256 256 /dev/disk/dsk0a

2 262144 216464 17% on 256 256 /dev/disk/dsk0b

3 1858624 1835744 1% on 256 256 /dev/disk/dsk2h

---------- ---------- ------

2251840 2129136 5%

#

#

# showfsets -q bruden_dom

Block (512) limits File limits

Fileset BF used soft hard grace used soft hard grace

bruce_fset -- 45536 0 0 4 0 0

dennis_fset -- 68288 0 0 5 0 0

#

#

# showfsets -q bruden_dom dennis_fset

Block (512) limits File limits

Fileset BF used soft hard grace used soft hard grace

dennis_fset -- 68288 0 0 5 0 0

#

#

#

# chfsets -B 25000 -b 30000 bruden_dom bruce_fset

bruce_fset

Id : 37f12c39.000263ea.1.8001

Block H Limit: 0 --> 30000

Block S Limit: 0 --> 25000

#

#

#

# showfsets -q bruden_dom

Block (512) limits File limits

Fileset BF used soft hard grace used soft hard grace

bruce_fset -- 45536 50000 60000 4 0 0

dennis_fset -- 68288 0 0 5 0 0

#

#

# showfsets -qk bruden_dom

Block ( 1k) limits File limits

Fileset BF used soft hard grace used soft hard grace

bruce_fset -- 22768 25000 30000 4 0 0

dennis_fset -- 34144 0 0 5 0 0

#

#

#

# cp /vmunix /usr/bruce/big4

#

# su obrien

$

$

$ cp /vmunix /usr/bruce/big5

cp: /usr/bruce/big5: Permission denied

$

$ ls -ld /usr/bruce

drwxr-xr-x 3 root system 8192 Sep 28 17:21 /usr/bruce

$

$ su -

#

#

# chmod ugo+w /usr/bruce

# chmod ugo+w /usr/dennis

#

# su obrien

$

$

$ cp /vmunix /usr/bruce/big5

/usr/bruce: write failed, fileset disk limit reached

cp: /usr/bruce/big5: Disc quota exceeded

$

$

$ df -t /usr/bruce

Filesystem 512-blocks Used Available Capacity Mounted on

bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce

$

$ showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 76928 41% on 256 256 /dev/disk/dsk0a

2 262144 216464 17% on 256 256 /dev/disk/dsk0b

3 1858624 1812992 2% on 256 256 /dev/disk/dsk2h

---------- ---------- ------

2251840 2106384 6%

$

$

$ showfsets -q bruden_dom

Block (512) limits File limits

Fileset BF used soft hard grace used soft hard grace

bruce_fset *- 68288 50000 60000 none 6 0 0

dennis_fset -- 68288 0 0 5 0 0

$

$

$

$ cp /vmunix /usr/dennis/big5

$

$ df -t advfs

Filesystem 512-blocks Used Available Capacity Mounted on

usr_domain#usr 1426112 1025636 294816 78% /usr

usr_domain#var 1426112 75682 294816 21% /var

bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce

bruden_dom#dennis_fset 2251840 91040 2083632 5% /usr/dennis

$
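The quota figures in this transcript appear in two units: chfsets was given 25000 and 30000, showfsets -qk reports the same numbers in 1K units, and showfsets -q reports 50000 and 60000 in 512-byte blocks. The sketch below is a small unit-conversion check under that reading; the helper name is hypothetical and the numbers are copied from the transcript.

```python
BLOCK = 512          # bytes per block in the default showfsets/df view
KBLOCK = 1024        # bytes per unit in the showfsets -k view


def kb_to_blocks(kb):
    """Convert 1K units to 512-byte blocks."""
    return kb * KBLOCK // BLOCK


soft_kb, hard_kb = 25000, 30000          # values passed to chfsets -B and -b
soft_blocks = kb_to_blocks(soft_kb)      # soft limit as showfsets -q reports it
hard_blocks = kb_to_blocks(hard_kb)      # hard limit as showfsets -q reports it

used_blocks = 68288                      # bruce_fset usage after root copied big4
over_soft = used_blocks > soft_blocks    # transcript shows '*-' in the BF column here
```

The same conversion relates the "used" columns of the two views (22768 in 1K units, 45536 in 512-byte blocks).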

Cloning

1. Make a clone of one of your filesets. How long did it take to create?

Clones take very little time to create since no data is copied. Some metadata is created to represent the clone.

$

$ clonefset bruden_dom dennis_fset dennis_clone

Permission denied - user must be root to run clonefset.

usage: clonefset domain origSetName cloneSetName

$

$

$

$

#

# clonefset bruden_dom dennis_fset dennis_clone

#

2. Check the contents of the clone. Does it match the contents of the original?

3. Put a new file in the original fileset. Does it appear in the clone?

4. Try to add a new file to the clone. What happened?

Contents of the clone match the original fileset. New files will not appear in the clone since the clone is effectively a snapshot. A clone cannot be written to since the clone is a read-only, pseudo copy of the original fileset.

# mkdir /usr/den_clone

#

#

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 76928 41% on 256 256 /dev/disk/dsk0a

2 262144 193712 26% on 256 256 /dev/disk/dsk0b

3 1858624 1812864 2% on 256 256 /dev/disk/dsk2h

---------- ---------- ------

2251840 2083504 7%

#

# showfsets bruden_dom

bruce_fset

Id : 37f12c39.000263ea.1.8001

Files : 6, SLim= 0, HLim= 0

Blocks (512) : 68288, SLim= 50000, HLim= 60000 grc= none

Quota Status : user=off group=off

dennis_fset

Id : 37f12c39.000263ea.2.8001

Clone is : dennis_clone

Files : 6, SLim= 0, HLim= 0

Blocks (512) : 91040, SLim= 0, HLim= 0

Quota Status : user=off group=off

dennis_clone

Id : 37f12c39.000263ea.3.8001

Clone of : dennis_fset

Revision : 1

#

#

# mount bruden_dom#dennis_clone /usr/den_clone

#

#

# df -t advfs

Filesystem 512-blocks Used Available Capacity Mounted on

usr_domain#usr 1426112 1025638 294816 78% /usr

usr_domain#var 1426112 75686 294816 21% /var

bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce

bruden_dom#dennis_fset 2251840 91040 2083504 5% /usr/dennis

bruden_dom#dennis_clone 2251840 91040 2083504 5% /usr/den_clone

#

#

#

# ls -li /usr/dennis

total 45528

3 drwx------ 2 root system 8192 Sep 28 17:04 .tags

6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1

7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2

8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3

9 -rwxr-xr-x 1 obrien system 11646960 Sep 28 17:26 big5

5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group

4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user

#

#

# ls -li /usr/den_clone

total 45528

3 drwx------ 2 root system 8192 Sep 28 17:04 .tags

6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1

7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2

8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3

9 -rwxr-xr-x 1 obrien system 11646960 Sep 28 17:26 big5

5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group

4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user

#

#

#

# cat /etc/disktab > /usr/dennis/sm1

#

# ls -li /usr/dennis/sm1

10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 /usr/dennis/sm1

#

# ls -li /usr/den_clone/sm1

ls: /usr/den_clone/sm1 not found

#

#

# cat /etc/disktab > /usr/den_clone/sm1

sh: /usr/den_clone/sm1: cannot create

#

#

#

# ls -li /usr/dennis

total 45559

3 drwx------ 2 root system 8192 Sep 28 17:04 .tags

6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1

7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2

8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3

9 -rwxr-xr-x 1 obrien system 11646960 Sep 28 17:26 big5

5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group

4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user

10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 sm1
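The behaviour seen in this transcript (instant creation, contents frozen at clone time, writes refused) is what a copy-on-write snapshot provides. The guide does not describe the internals at this point, so the following is only a toy model under that assumption: the class names are hypothetical and a Python dict stands in for on-disk pages.

```python
class ToyFileset:
    """Toy fileset: a dict of page number -> page contents."""

    def __init__(self):
        self.pages = {}
        self.clone = None

    def make_clone(self):
        # Creating the clone copies no data; it starts out empty and
        # falls back to the original for every page it does not hold.
        self.clone = ToyClone(self)
        return self.clone

    def write(self, page, data):
        # Copy-on-write: preserve the old contents in the clone before the
        # first change to a page (None marks "page did not exist yet").
        if self.clone is not None and page not in self.clone.saved:
            self.clone.saved[page] = self.pages.get(page)
        self.pages[page] = data


class ToyClone:
    """Read-only view of a ToyFileset as it was when the clone was made."""

    def __init__(self, original):
        self.original = original
        self.saved = {}

    def read(self, page):
        if page in self.saved:
            return self.saved[page]
        return self.original.pages.get(page)

    def write(self, page, data):
        raise PermissionError("clone filesets are read-only")


fs = ToyFileset()
fs.write(0, "big1 v1")
snap = fs.make_clone()
fs.write(0, "big1 v2")      # original changes; clone keeps the old page
fs.write(1, "sm1")          # new data never shows up in the clone
```

In this model the clone consumes space only as the original changes, which is consistent with a clone being quick to create.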

Using Trashcans

1. Delete a file from one of your filesets. Can you get it back?

2. Create a trashcan. Associate it with your fileset.

3. Delete a file from the fileset. Can you get it back?

You can retrieve deleted files only if you have a trashcan associated with the directory.

# rm /usr/dennis/big5

#

#

#

# mkdir /usr/dennis/den_trash

#

# chmod a+w /usr/dennis/den_trash

#

# mktrashcan /usr/dennis/den_trash /usr/dennis

’/usr/dennis/den_trash’ attached to ’/usr/dennis’

#

#

# rm /usr/dennis/big3

#

# ls -li /usr/dennis

total 22815

3 drwx------ 2 root system 8192 Sep 28 17:04 .tags

6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1

7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2

11 drwxrwxrwx 2 root system 8192 Sep 28 17:38 den_trash

5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group

4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user

10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 sm1

#

#

# ls -li /usr/dennis/den_trash

total 11376

8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3

#

# mv /usr/dennis/den_trash/big3 /usr/dennis

#

# ls -li /usr/dennis/den_trash

total 0

# ls -li /usr/dennis

total 34191

3 drwx------ 2 root system 8192 Sep 28 17:04 .tags

6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1

7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2

8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3

11 drwxrwxrwx 2 root system 8192 Sep 28 17:38 den_trash

5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group

4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user

10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 sm1

Striping

1. Create an empty striped file. Use the showfile -x command to view the extents of the empty file.

2. Show the extents of one of your large files.

3. Copy the large file to the empty striped file.

4. Revisit the extents of the striped file. Is there any performance difference in reading the two files?

5. Use # time cat /mnt_pnt/big_file > /dev/null to time each read.

Depending on the configuration of the stripe volumes, there should be a performance improvement when using the striped file.

# touch /usr/dennis/stripe1

#

# stripe -n 2 /usr/dennis/stripe1

#

#

# showfile -x /usr/dennis/stripe1

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

c.8001 2 16 0 stripe 2 8 async 100% stripe1

extentMap: 1

pageOff pageCnt volIndex volBlock blockCnt

extentCnt: 0

extentMap: 2

pageOff pageCnt volIndex volBlock blockCnt

extentCnt: 0

#

#

# cd /usr/dennis

#

# ls -li

total 34191

3 drwx------ 2 root system 8192 Sep 28 17:04 .tags

6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1

7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2

8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3

11 drwxrwxrwx 2 root system 8192 Sep 28 17:38 den_trash

5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group

4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user

10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 sm1

12 -rw-r--r-- 1 root system 0 Sep 28 17:39 stripe1

#

# showfile -x big3

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

8.8001 2 16 1422 simple ** ** async 100% big3

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 1422 2 75504 22752

extentCnt: 1

# showfile -x big2

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

7.8001 1 16 1422 simple ** ** async 100% big2

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 1422 1 57760 22752

extentCnt: 1

#

# showfile -x big1

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

6.8001 2 16 1422 simple ** ** async 100% big1

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 1422 2 52592 22752

extentCnt: 1

#

#

# cp big3 stripe1

#

# showfile -x stripe1

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

c.8001 2 16 1422 stripe 2 8 async 100% stripe1

extentMap: 1

pageOff pageCnt volIndex volBlock blockCnt

0 8 3 1093536 11392

16 8

32 8

48 8

64 8

(...) 1392 8

1408 8

extentCnt: 1

extentMap: 2

pageOff pageCnt volIndex volBlock blockCnt

8 8 1 80704 11360

24 8

40 8

56 8

(...) 1400 8

1416 6

extentCnt: 1

#

#

#

# time cat big3 > /dev/null

real 0m1.61s

user 0m0.00s

sys 0m0.36s

# time cat stripe1 > /dev/null

real 0m1.03s

user 0m0.01s

sys 0m0.36s
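The two extent maps shown for stripe1 can be cross-checked with a little arithmetic. The sketch below deals 1422 pages out in 8-page segments across 2 stripes and converts the per-stripe page counts to 512-byte blocks. The function name is hypothetical and the round-robin rule is an assumption taken from the pageOff pattern in the transcript (0, 16, 32 on one map and 8, 24, 40 on the other).

```python
def pages_per_stripe(total_pages, seg_pages, stripe_count):
    """Count how many pages each stripe receives under round-robin segments."""
    counts = [0] * stripe_count
    for first in range(0, total_pages, seg_pages):
        stripe = (first // seg_pages) % stripe_count
        counts[stripe] += min(seg_pages, total_pages - first)
    return counts


BLOCKS_PER_PAGE = 16                       # 8K page / 512-byte blocks

counts = pages_per_stripe(1422, 8, 2)      # Pages, SegSz, Segs from showfile -x
blocks = [c * BLOCKS_PER_PAGE for c in counts]
```

The resulting block counts can be compared with the blockCnt column of the two extent maps; the short final segment explains why the second map ends with a pageCnt of 6.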

2

AdvFS On-Disk Structures

About This Chapter

Introduction

This chapter presents information about the AdvFS on-disk structures.

Objectives

To describe AdvFS on-disk structures, you should be able to describe the on-disk structures associated with these items:

• Bitfiles

• Mcells

• Extent maps

• Tags

• Fragments

• Storage bitmap bitfiles

Resources

For more information on topics in this chapter as well as related topics, see the following:

• Advanced File System Administration

• AdvFS Reference Pages

• Header Files

Introducing AdvFS On-Disk Structures

Overview

This section provides a foundation for learning the details of the AdvFS on-disk structures. It covers the following topics:

• Two-level implementation of AdvFS

• Using the .tags directory

• BAS on-disk format: everything is a bitfile

• Bitfiles

• Mcells

• AdvFS file addresses

• Filesets (formerly referred to as bitfile sets)

• Reusing tags

Two-Level Implementation of AdvFS

AdvFS is built using a two-layer strategy separating file access support from file storage support. The two layers are:

• File access system (FAS)

— Higher level of AdvFS

— Transforms bitfiles into normal UNIX files

— Client layer

Think of this as the standard file system components such as directories, support for quotas, mount points, and so forth.

• Bitfile access system (BAS)

— Lowest level of AdvFS providing storage support

— Contains the more complex storage structures supporting and containing the metadata for AdvFS

The following figure depicts the FAS as conceptually connected to the BAS through the .tags directory.

Figure 2-1: Two-Level Implementation of AdvFS

.tags Directory

Consider the .tags directory as a way to access the BAS from the context of the FAS. All files and metadata are accessible through .tags by using the file’s tag number or a special metadata file name.

The tag number is conceptually similar to a UFS inode number. Use the ls -li or showfile commands to discover a file’s tag. Tag numbers are associated with a sequence number that indicates the number of times this tag has been used.

The .tags directory provides access to files within a mounted fileset using tag numbers. The .tags directory also provides access to the lower-level, “special” metadata files through predefined names (M-10 is the BMT, M-6 is the RBMT). The new AdvFS on-disk viewing utilities (nvtagpg, nvbmtpg, nvfragpg, nvlogpg) hide most of the details of accessing metadata files through the .tags directory.

Each AdvFS file system has a .tags directory, which allows files to be accessed by tag and sequence number:

• /Advfs_mount_point/.tags/15374

• /Advfs_mount_point/.tags/0x3c0e.8001
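The two sample paths use different spellings of a tag: 15374 in decimal is 0x3c0e in hexadecimal, so they appear to name the same tag, once by number alone and once as tag.sequence in hex. The sketch below builds both forms. The function name is hypothetical, and treating the two paths as equivalent is an inference from the matching numbers rather than something the guide states.

```python
def tags_paths(mount_point, tag, seq):
    """Return the decimal and hex 'tag.sequence' spellings of a .tags path."""
    decimal_form = f"{mount_point}/.tags/{tag}"
    hex_form = f"{mount_point}/.tags/0x{tag:x}.{seq:x}"
    return decimal_form, hex_form


paths = tags_paths("/Advfs_mount_point", 15374, 0x8001)
```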

[Figure 2-1: the File Access Subsystem (FAS), which contains the UNIX directory structure, is connected through the .tags directory to the Bitfile Access Subsystem (BAS), which contains the AdvFS on-disk metadata: bitfile metadata table (BMT), storage bitmap (SBM), miscellaneous bitfile, root tag file, fragment bitfile, fileset tag file, and so on.]

The following example shows an inode number (tag number) being used to access a file through the .tags directory.

Example 2-1: Tag Number can Access Any File

# ls -li
total 31
22894 -rwxr-xr-x 1 root system 31114 Jun 24 15:20 ob_1
#
# tail -3 ob_1
:of#169728:pf#35135:bf#8192:ff#1024:\
:og#99458:pg#149368:bg#8192:fg#1024:\
:oh#0:ph#0:bh#8192:fh#1024:
#
# tail -3 /usr/.tags/22894
:of#169728:pf#35135:bf#8192:ff#1024:\
:og#99458:pg#149368:bg#8192:fg#1024:\
:oh#0:ph#0:bh#8192:fh#1024:
#

BAS On-Disk Format: Everything is a Bitfile

All AdvFS on-disk structures can be accessed as bitfiles. This includes user files and directories as well as the AdvFS metadata structures.

Bitfiles are arrays of 8K disk pages holding user data or metadata. A series of contiguous 8K pages in a bitfile is stored as an extent.
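The page arithmetic can be tried on a file from the Chapter 1 solutions: the 11646960-byte copies of /vmunix are reported by showfile -x as 1422 pages and 22752 blocks. The sketch below is a minimal check of those numbers; the helper name is hypothetical.

```python
PAGE_BYTES = 8192      # one AdvFS page
BLOCK_BYTES = 512      # one disk block as reported by showfile


def pages_needed(size_bytes):
    """Number of whole 8K pages required to hold size_bytes."""
    return -(-size_bytes // PAGE_BYTES)     # ceiling division


size = 11646960                              # byte size of the copied /vmunix files
pages = pages_needed(size)
blocks = pages * (PAGE_BYTES // BLOCK_BYTES)
```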

Each bitfile is identified by its tag, which consists of a tag number and sequence number pair. Use the tag number to locate the extents of a file.

The tag2name program in /sbin/advfs can (usually) translate a tag number to a path name. Note that tag2name has a new format for V5, as shown in the following example.

Example 2-2: Tag Number 22894 Being Translated by tag2name

# /sbin/advfs/tag2name /usr/.tags/22894
/usr/bruden/ob_1
#
# echo $PATH
/sbin:/usr/sbin:/usr/bin:/usr/ccs/bin:/usr/bin/X11:/usr/local
#
# PATH=$PATH:/sbin/advfs
#
# tag2name usr_domain -S usr 22894
open_vol: open for volume "/dev/disk/dsk2g" failed: Device busy
#
# tag2name -r usr_domain -S usr 22894     <== Uses raw device (-r)
bruden/ob_1
#

The following figure depicts the AdvFS metadata being used to access the bitfiles by referencing a file system directory.

Figure 2-2: Using AdvFS Metadata to Translate FAS to BAS

[Figure: a file system directory lists file1 (tag 623), file2 (tag 51), and file3 (tag 893); the AdvFS metadata maps the tag for file3 to its location on disk at LBN 80334, where AdvFS sees it as a bitfile.]

The following figure shows the logical file as a series of 8K pages, represented at the lower level by one or more mcell data structures found in the BAS.

Figure 2-3: BAS On-Disk Format

Bitfiles

A bitfile is an array of 8K pages, stored on disk as extents. Extents are:

• Groups of on-disk contiguous 8K pages

• Managed by extent maps

Bitfiles are identified by a tag:

• Tag.sequence such as 4714.8001

• Tag number is similar to an inode number

• Sequence number functions as a generation number

Every sector on disk is either free or part of a bitfile; bitfile storage is managed through mcell chains.

Use the showfile command to find the tag and sequence number.
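To make the role of an extent map concrete, the sketch below looks up the disk location of a file page from rows shaped like showfile -x output. It is an illustration only: the Extent type and function name are hypothetical, and it assumes that an extent's pages occupy consecutive 16-block runs starting at volBlock. The sample row is the single extent shown for big3 in the Chapter 1 solutions.

```python
from collections import namedtuple

# One row of a showfile -x extent map
Extent = namedtuple("Extent", "page_off page_cnt vol vol_block")

BLOCKS_PER_PAGE = 16          # 8K page / 512-byte blocks


def locate_page(extents, page):
    """Return (volume, starting block) for a file page, or None if unmapped."""
    for ext in extents:
        if ext.page_off <= page < ext.page_off + ext.page_cnt:
            offset_pages = page - ext.page_off
            return ext.vol, ext.vol_block + offset_pages * BLOCKS_PER_PAGE
    return None


# big3 from the Chapter 1 solutions: one extent of 1422 pages on volume 2
big3 = [Extent(page_off=0, page_cnt=1422, vol=2, vol_block=75504)]
first = locate_page(big3, 0)
tenth = locate_page(big3, 10)
```

A file with several extents simply has more rows; the lookup walks them until one covers the requested page.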

Mcells

Several metadata bitfiles (RBMT, BMT) have an internal page organization consisting of a page header and a series of mcell data structures. Each mcell can contain a series of variable-length records describing various bitfile attributes and characteristics.

AdvFS locates extents by finding the file’s primary metadata cell (mcell) and stepping through the mcells to find the extent information.
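Stepping through an mcell chain can be pictured as walking a linked list. The sketch below is a toy model of that walk: the dictionary layout, the (vol, page, cell) keys, and the sample extents are all made up for illustration and do not reflect the real on-disk record formats.

```python
def collect_extents(mcells, primary_id):
    """Follow an mcell chain from the primary mcell, gathering extent records.

    mcells maps an mcell ID (vol, page, cell) to a dict with an 'extents'
    list and a 'next' ID (None at the end of the chain).
    """
    extents = []
    current = primary_id
    seen = set()
    while current is not None and current not in seen:
        seen.add(current)                   # guard against a looping chain
        cell = mcells[current]
        extents.extend(cell["extents"])
        current = cell["next"]
    return extents


# Hypothetical two-mcell chain; IDs and extents are made up for illustration.
toy_bmt = {
    (1, 0, 5): {"extents": [(0, 100)], "next": (1, 3, 2)},
    (1, 3, 2): {"extents": [(100, 50)], "next": None},
}
found = collect_extents(toy_bmt, (1, 0, 5))
```

The loop guard is a defensive choice for a troubleshooting context, where a damaged chain might point back at itself.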

[Figure 2-3: a logical file of 8K pages (extent 1, extent 2) is described on disk by a primary mcell of 292 bytes, which contains variable-sized records such as POSIX attributes (owner, group, size, mode bits) and extent map records. Optional additional mcells can contain more extent map records if needed.]

The tag number is like an inode number, but an mcell functions like an inode.

• It holds permissions, size, extent information, link count and so forth.

• Each mcell is 292 bytes; 28 mcells plus a 16-byte page header fill an 8K page.
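The page arithmetic in the last bullet works out exactly, as the short sketch below shows. The offset helper is hypothetical and assumes the mcells are packed back to back immediately after the page header, which the guide implies but does not spell out.

```python
PAGE_BYTES = 8192
HEADER_BYTES = 16
MCELL_BYTES = 292

MCELLS_PER_PAGE = (PAGE_BYTES - HEADER_BYTES) // MCELL_BYTES


def mcell_offset(cell):
    """Byte offset of mcell number `cell` within its page."""
    if not 0 <= cell < MCELLS_PER_PAGE:
        raise ValueError("cell number out of range for one page")
    return HEADER_BYTES + cell * MCELL_BYTES


leftover = PAGE_BYTES - HEADER_BYTES - MCELLS_PER_PAGE * MCELL_BYTES
```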

The following figure shows how a file’s tag number can locate the file’s primary mcell in the BMT. The primary mcell provides access to the actual data blocks of the file.

Figure 2-4: Tag Number to BMT mcell to Logical Blocks

AdvFS File Addresses

The lowest level of AdvFS is the bitfile access system (BAS). Here every file is a bitfile, a collection of 8192-byte pages. The higher level of AdvFS, the file access system (FAS), enables the bitfiles to appear as normal UNIX files. Both ls (with the -i option) and showfile print the tag number.

AdvFS files have two means of identification:

• Tag

— Similar to the UFS inode number

— Can be discovered with the ls -i command

• Primary mcell ID

— Component of the lower BAS layer

— Start of a linked list of one or more mcells

— Well hidden from users and system administrators

[Figure 2-4: the (massaged) tag number indexes into the Bitfile Metadata Table (BMT), which holds many mcells; the primary mcell describes and locates the file and may be chained to further mcells.]

Bitfile-Set

Bitfile-sets have the following characteristics:

• FAS fileset represents a BAS bitfile-set

• Identified by numbers

• Bitfiles are known by:

— Domain ID

— Fileset ID

— Tag and sequence number

Domain IDs can be found using the showfdmn command. Fileset IDs can be found using the showfsets command.
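The Chapter 1 transcript shows these identifiers in practice: showfdmn prints the domain Id 37f12c39.000263ea, and showfsets prints 37f12c39.000263ea.2.8001 for dennis_fset. The sketch below splits such a string into its parts. The function name and field labels are hypothetical, and two points are assumptions: that the fields are hexadecimal, and that the first field is a creation time in seconds since 1970 (it decodes to a time four hours ahead of the "Tue Sep 28 16:59:37 1999" shown by showfdmn, which would fit a local time zone four hours behind UTC).

```python
from datetime import datetime, timezone


def parse_fileset_id(text):
    """Split 'dddddddd.dddddddd.t.ssss' into domain ID, fileset tag, sequence."""
    secs, usecs, tag, seq = text.split(".")
    return {
        "domain_id": f"{secs}.{usecs}",
        "fileset_tag": int(tag, 16),
        "sequence": int(seq, 16),
        # Assumption: the first field is the creation time in seconds since 1970.
        "created_utc": datetime.fromtimestamp(int(secs, 16), tz=timezone.utc),
    }


info = parse_fileset_id("37f12c39.000263ea.2.8001")
```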

Reusing Tags

Each time a file is created, a tag is allocated to represent that file. When the file is deleted, the tag is recycled. If the tag is selected for reuse, AdvFS can distinguish between the incarnations of the tag by referencing the tag’s sequence number. The sequence number is 16 bits, with the leftmost bit used as the “in use” indicator. Therefore sequence number 8003 means the tag is in use (0x8 = 1000 binary), and it is the third use of the tag. Of the remaining 15 bits, only 12 are used for sequencing. Therefore a tag can be reused 4096 times before it becomes a “dead” tag.

Tag numbers can be reused:

• With file creation and deletion.

• Like inode numbers.

A sequence number identifies various versions of the tag:

• Tags have an initial sequence number of 8001 hexadecimal or 32769 decimal.

• Sequence number is incremented when tag number is reused.

• When sequence number overflows, tag is discarded.

• Leftmost bit indicates tag in use; of the remaining 15 bits, 12 are used for sequencing.
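The bit layout described above can be decoded with two masks. The sketch below is a minimal illustration: the names are hypothetical, and it assumes the 12 sequencing bits are the low-order bits of the 16-bit value, which the guide does not state explicitly.

```python
IN_USE_BIT = 0x8000        # leftmost of the 16 bits
SEQ_MASK = 0x0FFF          # assumption: the 12 sequencing bits are the low-order ones


def decode_sequence(seq):
    """Return (in_use, use_count) for a 16-bit tag sequence number."""
    in_use = bool(seq & IN_USE_BIT)
    use_count = seq & SEQ_MASK
    return in_use, use_count


first_use = decode_sequence(0x8001)
third_use = decode_sequence(0x8003)
max_uses = SEQ_MASK + 1
```

Twelve bits give 4096 distinct values, which matches the reuse limit quoted in the text.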

Describing BAS On-Disk Metadata Bitfiles

Overview

To troubleshoot on-disk problems, an understanding of the metadata bitfiles is mandatory. This section introduces the metadata bitfiles and covers the following topics:

• Domain and volume structures

• Bitfile metadata table

• Per domain bitfiles

• Per volume bitfiles

• Per fileset bitfiles

• Reserved bitfile special names

• Metadata bitfile tags

• .tags directory entries for metadata bitfiles

Domain and Volume Structures

Each volume in an AdvFS domain consists of the following structures:

• Reserved bitfile metadata table

• Bitfile metadata table

• Storage bitmap

• Miscellaneous bitfile

In addition to the per-volume structures, each domain also has the following structures. For these structures there is only one per domain and they can reside on any volume in the domain:

• Transaction log

• Root tag file

The following figure illustrates the BAS on-disk metadata bitfiles.

Figure 2-5: BAS On-Disk Metadata Bitfiles

Per Domain Bitfiles

Each domain is supported by the following bitfiles:

• The on-disk log:

— Contains the transaction log.

— Is usually 4MB in size.

• The root tag file:

— Lists filesets.

— Is one page in size.

The following example uses the nvtagpg command to display the root tag file of the usr_domain. The -r option specifies that the raw device be used for access instead of the block device, which avoids a device busy error.

Example 2-3: Displaying the Root Tag File

# nvtagpg -r usr_domain
==========================================================================
DOMAIN "usr_domain"  VDI 1 (/dev/rdisk/dsk2g)  lbn 96    root TAG page 0
--------------------------------------------------------------------------
currPage 0
numAllocTMaps 3  numDeadTMaps 0  nextFreePage 0  nextFreeMap 5

tMapA[1]  tag 1  seqNo 1  primary mcell (vol,page,cell)  1 0 1    usr
tMapA[2]  tag 2  seqNo 1  primary mcell (vol,page,cell)  1 0 13   var
tMapA[3]  tag 3  seqNo 1  primary mcell (vol,page,cell)  2 2 4    ob_fset
#

[Figure 2-5: per-volume bitfiles are the reserved bitfile metadata table, bitfile metadata table, storage bitmap, and miscellaneous bitfile (one set on each volume); per-domain bitfiles are the root tag file and the on-disk log; per-fileset bitfiles are the tag file and the fragment file (shown for fileset A).]

Per Volume Bitfiles

Each volume is supported by the following bitfiles:

• Reserved bitfile metadata table (RBMT)

— Contains mcells for reserved bitfiles

— Last mcell in each RBMT page links to the next page

— Eliminates BMT fragmentation problems

The following example uses the nvbmtpg command to display summary information about the RBMT.

Example 2-4: Displaying RBMT Summary Information

# nvbmtpg -r -R usr_domain
==========================================================================
DOMAIN "usr_domain"  VDI 1 (/dev/rdisk/dsk2g)  lbn 32    RBMT page 0
--------------------------------------------------------------------------
There is 1 page in the RBMT on this volume.
There are 19 free mcells in the RBMT on this volume.
==========================================================================
DOMAIN "usr_domain"  VDI 2 (/dev/rdisk/dsk4b)  lbn 32    RBMT page 0
--------------------------------------------------------------------------
There is 1 page in the RBMT on this volume.
There are 19 free mcells in the RBMT on this volume.

• Bitfile metadata table (BMT)

— Contains all the mcells for nonreserved bitfiles

— Grows as new files are created

The following example displays summary information about the BMT for volume one in the usr_domain.

Example 2-5: Displaying BMT for Volume 1 of usr_domain

# nvbmtpg -r usr_domain 1
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 48 BMT page 0
--------------------------------------------------------------------------
There are 1025 pages in the BMT on this volume.
The BMT uses 2 extents (out of 33) in 2 mcells.

• Storage bitmap (SBM)

Contains 1 bit for every 8K bytes (1 bit per 1K in Tru64 UNIX V4.0)

• Miscellaneous bitfile

Contains bootblocks and is four pages in size


Per Fileset Bitfiles

A bitfile-set is BAS nomenclature for FAS fileset. Most references to the term bitfile-set are being replaced with the term fileset to avoid unnecessary confusion. Each fileset is supported by the following bitfiles:

• Tag file (not .tags):

— Translates tag number into location of primary mcell (within BMT) for the appropriate file

— Formerly called the tag directory file

• Tags can be reused as files are deleted:

— Limited to size of associated sequence number

— 8001 is a typical sequence number showing that this tag is in use (left bit is set) and it is in use for the first time (001)

— Limits tag reuse to ~4k times before tag is dead

• Tag file consists of 8K pages with 1022 tagmap entries (8 bytes each) preceded by a 16-byte header:

— Tagmap entry contains the sequence number, volume index, and primary mcell ID (BMT page number and cell number within page)

The following figure shows information from the fileset tag file locating the primary mcell for a file. There may be a chain of mcells describing the file.


Figure 2-6: Fileset Tag Directory Locating Primary Mcell

The following example uses the tag number of a file to locate the tag directory file entry for the file. The volume, page, and cell information found in the tag directory file is used to access the primary mcell of the file.

Example 2-6: Finding Primary Mcell through Tag Directory

# ls -li big1
22896 -rwxr-xr-x 1 root system 13729520 Jun 24 16:53 big1
#
# nvtagpg -r usr_domain -T 1 -t 22896
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 2055136 "usr" FRAG page 22
--------------------------------------------------------------------------
currPage 22
numAllocTMaps 1022 numDeadTMaps 0 nextFreePage 0 nextFreeMap 0

tMapA[412] tag 22896 seqNo 1 primary mcell (vol,page,cell) 1 951 15
#

[Figure 2-6 content: the fileset tag file (M1) entry for tag 893 (sequence # 3, volume # 1, BMT page 811, mcell 7) points to mcell 7 on BMT page 811, which describes an extent at 80334 of 50 pages.]


# nvbmtpg -r usr_domain 1 951 15
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 23552 BMT page 951
--------------------------------------------------------------------------
CELL 15 next mcell volume page cell 2 2 9 bfSetTag,tag 1,22896

RECORD 0 bCnt 92 BSR_ATTR
    type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS
    type BSXMT_APPEND chain mcell volume page cell 1 951 16
    firstXtnt mcellCnt 2 xCnt 2
    bsXA[ 0] bsPage 0 vdBlk 573376 (0x8bfc0)
    bsXA[ 1] bsPage 21 vdBlk -1

RECORD 2 bCnt 92 BMTR_FS_STAT
    st_mode 100755 (S_IFREG) st_uid 0 st_gid 0 st_size 13729520
    st_nlink 1 dir_tag 22893 st_mtime Thu Jun 24 16:53:06 1999

A fragment bitfile:

• Contains small files including the last parts of small files.

• Varies in size with number of small files.

• Has tag number 1.

Reserved Bitfile Special Names

The .tags directory can be used with many special file names which provide user-level access to the BAS bitfiles. These are special names in that they are not visible to standard commands. They have meaning to AdvFS-aware software. Any command that uses the VFS I/O component will be AdvFS aware.

The following bitfiles are all accessible under the .tags directory.

Name              Function                                       One Per
M-6, M-12, ...    Reserved Bitfile Metadata Table                Volume
M-7, M-13, ...    Storage Bitmap                                 Volume
M-8, M-14, ...    Root Tag File                                  Domain
M-9, M-15, ...    Transaction Log                                Domain
M-10, M-16, ...   Bitfile Metadata Table                         Volume
M-11, M-17, ...   Miscellaneous Bitfile                          Volume
M1                Fileset tag file (not .tags) for fileset #1    Fileset
M2                Fileset tag file (not .tags) for fileset #2    Fileset
M3                Fileset tag file (not .tags) for fileset #3    Fileset
(...)
Mn                Fileset tag file (not .tags) for fileset #n    Fileset
1                 Fragment Bitfile                               Fileset
2                 Fileset's Root Directory                       Fileset
3                 .tags Directory                                Fileset
4                 User Quota File                                Fileset
5                 Group Quota File                               Fileset
6                 User File with Tag # 6                         Fileset
7                 User File with Tag # 7                         Fileset
(...)
n                 User File with Tag # n                         Fileset

* The next instance of an M-n file is M-(n+6). This is not true for Mn files.

Metadata Bitfile Tags

Nonreserved bitfiles (user files) are assigned tags from their tag file. Reserved bitfiles (metadata bitfiles) do not have tags assigned from a tag file because they exist before a tag file exists (such as the root tag file) and because their mcell locations must always be in a known place. Therefore, reserved tags are calculated as follows:

tag = - (reserved-bitfile-primary-mcell-number + (volume-index * 6))

Since tags are used to locate the bitfile’s primary mcell, the BAS translates a reserved tag to the primary mcell by reversing the above calculation (translating the tag to a volume number and an mcell address):

volume-index = -tag / 6

reserved-bitfile-primary-mcell-number = -tag % 6

So, for RBMT tags -6 and -12:

-6: -(-6)/6 == volume-index:1

-6: -(-6)%6 == mcell:0

-12: -(-12)/6 == volume-index:2

-12: -(-12)%6 == mcell:0
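The forward and reverse tag calculations can be sketched in C as follows. This is an illustration only, not code from the AdvFS sources; the function names and the RSVD_TAGS_PER_VOLUME constant are assumptions made for the example. The tag is negated before dividing so that the volume index and mcell number come out positive.

```c
#include <assert.h>

/* Six reserved bitfiles per volume: RBMT, SBM, root tag file, log,
 * BMT, miscellaneous bitfile. The constant name is hypothetical. */
#define RSVD_TAGS_PER_VOLUME 6

/* Forward calculation: primary mcell slot (0-5) and volume index -> tag. */
int rsvd_tag(int mcell_slot, int vol_index)
{
    assert(mcell_slot >= 0 && mcell_slot < RSVD_TAGS_PER_VOLUME);
    assert(vol_index >= 1);
    return -(mcell_slot + vol_index * RSVD_TAGS_PER_VOLUME);
}

/* Reverse calculation: negate the (negative) tag, then divide. */
int rsvd_tag_volume(int tag)
{
    assert(tag < 0);
    return -tag / RSVD_TAGS_PER_VOLUME;
}

/* Reverse calculation: the remainder is the primary mcell number. */
int rsvd_tag_mcell(int tag)
{
    assert(tag < 0);
    return -tag % RSVD_TAGS_PER_VOLUME;
}
```

For instance, rsvd_tag(3, 2) yields -15 (the log on volume 2), and rsvd_tag_volume(-15) and rsvd_tag_mcell(-15) recover volume 2 and mcell 3.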

Tags for reserved bitfiles of virtual disk vol_index are:

tag = - (magic-number-shown-in-table + (vol_index * 6))


Metadata bitfile tags can be printed in an unusual form: as unsigned hexadecimal values, which effectively translate to negative numbers.

fffffffa.0 RBMT for disk 1, fffffff3.0 SBM for disk 2
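As a rough sketch of how such a printed value can be read, the following C fragment reinterprets the 32-bit hexadecimal word as a signed tag and then names the reserved bitfile. The function names and the name table are assumptions made for this illustration; they are not AdvFS interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret a tag printed as 32-bit hexadecimal (for example fffffffa)
 * as a signed tag number. */
int32_t tag_from_hex_word(uint32_t word)
{
    /* A set top bit means a negative two's complement value;
     * subtract 2^32 using 64-bit arithmetic to avoid overflow. */
    int64_t wide = (int64_t)word;
    if (word & 0x80000000u)
        wide -= (int64_t)1 << 32;
    return (int32_t)wide;
}

/* Volume index encoded in a reserved (negative) tag. */
int rsvd_bitfile_volume(int32_t tag)
{
    assert(tag < 0);
    return (int)(-(int64_t)tag / 6);
}

/* Name of the reserved bitfile encoded in a reserved (negative) tag. */
const char *rsvd_bitfile_name(int32_t tag)
{
    static const char *const names[6] = {
        "RBMT", "SBM", "root tag file", "log", "BMT",
        "miscellaneous bitfile"
    };
    assert(tag < 0);
    return names[-(int64_t)tag % 6];
}
```

With these helpers, fffffffa decodes to tag -6 (RBMT, volume 1) and fffffff3 decodes to tag -13 (SBM, volume 2), matching the two values shown above.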

.tags Directory Entries for Metadata Bitfiles

The special file names in the .tags directory can take several forms. M-6 is more readable than -6, which would also work. Start with an M:

• Use the negative number for virtual disk-specific files

• Use the fileset ID for fileset tagfiles

For example, if /usr is an AdvFS fileset, use the values in the table.

Table 2-1: Metadata Bitfile Tags

Reserved File        Formula              Disk 1   Disk 2
RBMT                 - (0 + (vol * 6))    -6       -12
SBM                  - (1 + (vol * 6))    -7       -13
Root tag directory   - (2 + (vol * 6))    -8       -14
Log                  - (3 + (vol * 6))    -9       -15
BMT                  - (4 + (vol * 6))    -10      -16
Misc Bitfile         - (5 + (vol * 6))    -11      -17

Table 2-2: .tags for Metadata Bitfiles

File               Description
/usr/.tags/1       Fragment bitfile
/usr/.tags/M-6     RBMT of disk 1
/usr/.tags/-6      RBMT of disk 1 also
/usr/.tags/M-15    Log of disk 2
/usr/.tags/M2      Tag file for second fileset

Bitfile Metadata Table

The bitfile metadata table (BMT) holds the support data for user files and directories. It contains location information, permissions and other stats, extent information, fragment location, and other descriptive data. The metadata describing the BMT itself is contained in the RBMT. This avoids the BMT fragmentation problems seen in V4. It also eliminates the need for the -x and -p options on several commands.


The BMT is represented by a file found under the .tags directory. For the first volume, the special file name is M-10. It contains a series of 8K pages just like any other AdvFS file; however, its pages contain mcells (292 bytes each) and header information.

Each mcell contains one or more variable-length records describing various file attributes, extents, permissions, fragment info, and so forth.

The bitfile metadata table can grow just as any file can grow; it just adds another extent. It starts with slightly more than 1M (can be tailored).

BMT is created when the mkfdmn command is issued. There is one BMT for each volume in the domain.

The BMT stores bitfile metadata, including:

• Bitfile attributes

• Bitfile extent maps

• Bitfile set attributes

• FAS file attributes including the POSIX file stats

The BMT is an array of 8KB pages where each page consists of a header and an array of fixed-size metadata cells (mcells), where each mcell contains one or more variable-length records. The records are typed (for example, bitfile attributes, or extent map).

BAS record types are defined in src/kernel/msfs/msfs/bs_ods.h and ms_public.h.

The BMT contains all mcells for all files other than the reserved bitfiles:

• User files

• User directories

The mcells for the reserved bitfiles are in the RBMT.

RBMT and BMT:

• First mcell describes itself.

• Grows using extents as more mcells are needed.

• RBMT reserves the last mcell on each page to chain to other pages of mcells.

The following figure shows the on-disk layout for most of the reserved bitfiles.


Figure 2-7: Reserved Bitfiles On Disk Layout

Bitfile                                  Pages                    Sectors
Miscellaneous Bitfile (M-11)             Pages 0, 1               0-31
RBMT (M-6)                               Page 0                   32-47
BMT (M-10)                               Page 0                   48-63
Miscellaneous Bitfile (continued)        Pages 2, 3               64-95
Root Tag Directory (M-8)                 Page 1                   96-111
Storage Bitmap (M-7)                     1 bit per 8K cluster     112-?
Transaction Log (M-9)                    512 pages                ?
Fileset Tag Directory File (M1)          8 pages                  ?
Mount Point Directory for Fileset (2)    1 page                   ?
.tags Directory (3)                      1 page                   ?
Quota.user (4)                           1 page                   ?
Quota.Group (5)                          1 page                   ?

Mcell Records

These characteristics describe mcells and the records within them.

• The AdvFS equivalents of inodes are fixed-size (292-byte) mcells, packed 28 to an 8K page

• One or more linked mcells describe bitfiles

• First mcell in list is primary mcell

• Each mcell contains variably sized records describing attributes of the bitfile


Record types contained in the BMT and the RBMT include:

• Extent maps (of various kinds)

• Bitfile attributes (clone, original, and so forth)

• Domain attributes

• Virtual Disk attributes (disk ID, disk index)

• Fragment attributes

• POSIX file stats (permissions, size, link count)

• Symbolic link targets

Mcell Page Structure

This figure illustrates the structure of a page full of mcells.

Figure 2-8: Mcell Page Structure

[Figure 2-8 content: a page consists of a page header followed by 28 mcells; each mcell consists of an mcell header followed by variable-sized records.]


RBMT Page 0

RBMT page 0 starts at sector 32 (LBN 32) and contains these primary mcells:

• Mcell 0 reserved bitfile metadata table (RBMT)

• Mcell 1 storage bitmap (SBM)

• Mcell 2 root tag file (optional, one per domain)

• Mcell 3 log (optional, one per domain)

• Mcell 4 bitfile metadata table (BMT)

• Mcell 5 miscellaneous bitfile

RBMT also contains all secondary mcells (extent maps) for the BMT.

BMT Page 0

BMT page 0 includes the head of the BMT page free list. Any BMT page that contains at least one free mcell is on this free list. The free list head is maintained in the first mcell in BMT page 0. Note that BMT page 0 is not included in this free list.

BMT page 0 starts at sector 48. Mcell 0 is the head of the BMT page free list. The page also contains mcells for nonreserved bitfiles (that is, user files and directories). All other BMT pages are found via the RBMT.

BMT Page Format

Each 8192-byte page consists of a 16-byte header followed by twenty-eight 292-byte mcells.

The BMT header consists of:

• Pointer to next free mcell on page

• Pointer to next page with free mcells

• Number of free mcells on the page

• Page number (within BMT)

• AdvFS version (now 4)


The following example shows an excerpt from bs_ods.h.

Example 2-7: BMT Page Structure

typedef struct bsMPg {
    bfMCIdT nextfreeMCId;           /* Next free MCId on the page */
    uint32T nextFreePg;             /* Next page in the mcell free list */
    uint32T freeMcellCnt;           /* Number of free mcells on this pg */
    uint32T pageId : 27;            /* Page number */
    uint32T megaVersion : 5;        /* Overall structure version */
    struct bsMC bsMCA[BSPG_CELLS];  /* Array of Bs Cells */
} bsMPgT;

Mcell Addresses

Mcells can be located in several ways. The mcell is most often found through the tag file; the BMT page number and mcell number (within page) are all that is needed. Mcells are addressed by a 32-bit mcell ID, bfMCIdT.

• 27 bits give the mcell’s BMT page number.

• 5 bits give the mcell’s position within its page.
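The 27-bit/5-bit split can be illustrated with plain shifts and masks. This is only a sketch: which end of the 32-bit word holds the cell number is an assumption here (cell in the low 5 bits), and the helper names are invented for the example. Note that 5 bits can express 32 positions, while a page holds only 28 mcells.

```c
#include <assert.h>
#include <stdint.h>

#define MCID_CELL_BITS 5
#define MCID_CELL_MASK ((1u << MCID_CELL_BITS) - 1u)   /* 0x1f */
#define MCID_MAX_PAGE  ((1u << 27) - 1u)               /* largest 27-bit page */

/* Combine a BMT page number and a cell number into one 32-bit ID.
 * Assumed layout: page in the upper 27 bits, cell in the lower 5. */
uint32_t mcid_pack(uint32_t page, uint32_t cell)
{
    assert(page <= MCID_MAX_PAGE);
    assert(cell <= MCID_CELL_MASK);
    return (page << MCID_CELL_BITS) | cell;
}

/* Extract the BMT page number from a packed ID. */
uint32_t mcid_page(uint32_t mcid)
{
    return mcid >> MCID_CELL_BITS;
}

/* Extract the cell number within the page from a packed ID. */
uint32_t mcid_cell(uint32_t mcid)
{
    return mcid & MCID_CELL_MASK;
}
```

For example, packing page 951 and cell 15 and then unpacking the result returns the same page and cell.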

Every bitfile has a primary mcell.

• Tag files map tag numbers to primary mcell locations.

Use the nvtagpg command to see this mapping.

Reserved Mcell Addresses

Within RBMT page 0, the slot numbers listed are used to calculate the tag numbers for the reserved bitfiles:

0 RBMT itself

1 SBM (storage bitmap)

2 Root tag file

3 Transaction log

4 BMT

5 Miscellaneous bitfile


Mcell Format

Each mcell begins with a 24-byte header:

• 32-bit ID of the next mcell in the chain and virtual disk containing next mcell

• Position of this mcell within the chain

• Tag number of the bitfile

• Tag number of the fileset

Each mcell has 268 bytes remaining for mcell records.

The following bsMC structure is found in msfs/msfs/bs_ods.h:

typedef struct bsMC {
    bfMCIdT nextMCId;        /* Link to next mcell */
    uint16T nextVdIndex;     /* vd index of next mcell */
    uint16T linkSegment;     /* segment in link, starts at zero */
    bfTagT tag;              /* Tag this mcell is assigned to */
    bfTagT bfSetTag;         /* tag of this bitfile's bf set dir */
    char bsMR0[BSC_R_SZ];    /* Records */
} bsMCT;
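The sizes quoted in this section (24-byte mcell header, 268 bytes of records, 292-byte mcell, 16-byte page header, 28 mcells per 8192-byte page) can be cross-checked with stand-in structures. The types below are simplified mirrors written for this check, not the real definitions: bfMCIdT is assumed to be 4 bytes, bfTagT is assumed to be 8 bytes, and BSC_R_SZ is assumed to be 268.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CELLS_PER_PAGE 28    /* BSPG_CELLS in the excerpt */
#define RECORD_BYTES   268   /* assumed value of BSC_R_SZ */

/* Stand-in for bfTagT: tag number plus sequence number (assumed 8 bytes). */
typedef struct { uint32_t num; uint32_t seq; } tag_sketch_t;

/* Stand-in for bsMC. */
typedef struct {
    uint32_t     next_mcid;        /* stand-in for bfMCIdT (assumed 4 bytes) */
    uint16_t     next_vd_index;
    uint16_t     link_segment;
    tag_sketch_t tag;
    tag_sketch_t bf_set_tag;
    char         records[RECORD_BYTES];
} mcell_sketch_t;

/* Stand-in for bsMPg; pageId and megaVersion share one 32-bit word. */
typedef struct {
    uint32_t       next_free_mcid;
    uint32_t       next_free_pg;
    uint32_t       free_mcell_cnt;
    uint32_t       page_id_and_version;   /* 27 + 5 bits */
    mcell_sketch_t cells[CELLS_PER_PAGE];
} bmt_page_sketch_t;

size_t mcell_header_bytes(void) { return offsetof(mcell_sketch_t, records); }
size_t mcell_bytes(void)        { return sizeof(mcell_sketch_t); }
size_t page_header_bytes(void)  { return offsetof(bmt_page_sketch_t, cells); }
size_t bmt_page_bytes(void)     { return sizeof(bmt_page_sketch_t); }
```

Under these assumptions the arithmetic closes: 24 + 268 = 292 bytes per mcell, and 16 + 28 * 292 = 8192 bytes per page.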

Mcell Records

Mcell contents are arranged in variable-length records. They vary in size and type, and each begins with a 4-byte header:

• 2-byte count of record size

• 1-byte type field (there are now about 20 record types)

• 1-byte version number for different versions of a type

A null record, whose size is 4 and type is 0, is the end-of-records indicator.
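A record area can be walked by hopping from header to header until the null record is reached. The sketch below is hypothetical code written for this guide, not an AdvFS routine. It assumes that bCnt includes the 4-byte header (consistent with the null record having size 4, and with the earlier nvbmtpg output where 92 + 80 + 92 + 4 = 268), that records follow one another without gaps, and that bCnt is stored little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REC_HDR_BYTES  4     /* bCnt (2) + type (1) + version (1) */
#define REC_AREA_BYTES 268   /* record space in one mcell */

/* Count the records in a record area, stopping at the null record
 * (bCnt 4, type 0). Returns -1 if a length field is malformed. */
int count_mcell_records(const uint8_t *area, size_t area_len)
{
    size_t off = 0;
    int count = 0;

    assert(area != NULL);
    while (off + REC_HDR_BYTES <= area_len) {
        /* Little-endian read of the 16-bit byte count (an assumption). */
        uint16_t bcnt = (uint16_t)(area[off] | (area[off + 1] << 8));
        uint8_t type = area[off + 2];

        if (type == 0 && bcnt == REC_HDR_BYTES)
            return count;                 /* null record: end of records */
        if (bcnt < REC_HDR_BYTES || (size_t)bcnt > area_len - off)
            return -1;                    /* malformed length */
        off += bcnt;
        count++;
    }
    return off == area_len ? count : -1;
}

/* Build an area holding records of 92, 80 and 92 bytes followed by a
 * null record, then count them. Type codes 1-3 are placeholders. */
int sample_record_count(void)
{
    static const uint16_t sizes[] = { 92, 80, 92 };
    uint8_t area[REC_AREA_BYTES];
    size_t off = 0, i;

    memset(area, 0, sizeof area);
    for (i = 0; i < sizeof sizes / sizeof sizes[0]; i++) {
        area[off] = (uint8_t)(sizes[i] & 0xff);
        area[off + 1] = (uint8_t)(sizes[i] >> 8);
        area[off + 2] = (uint8_t)(i + 1);
        off += sizes[i];
    }
    area[off] = REC_HDR_BYTES;            /* null record: bCnt 4, type 0 */
    return count_mcell_records(area, sizeof area);
}
```

The real on-disk code may round record lengths up for alignment; this sketch does not attempt that.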

The following bsMR structure is found in msfs/msfs/bs_ods.h:

typedef struct bsMR {
    uint32T bCnt : 16;      /* Count of bytes in record */
    uint32T type : 8;       /* Type of structure contained by record */
    uint32T version : 8;    /* Version of the record's type */
} bsMRT;


Utilities for Viewing Records

The table shows the utilities for viewing records. These utilities are found in /sbin/advfs.

Utility     Description
nvbmtpg     Displays all records of a BMT (or RBMT) page
nvfragpg    Displays the pages of an AdvFS fragment file
nvlogpg     Displays the pages of the log file
nvtagpg     Displays the pages of a tag file
vsbmpg      Displays pages from the SBM
vfilepg     Displays the pages of any AdvFS file
savemeta    Takes a snapshot of all of a domain's metadata


Using Extent Maps

Overview

Extent maps are stored in mcells linked to a bitfile's primary mcell. Since mcells are of limited size, an extent map may span several mcells (these pieces of the extent map in different mcells are called subextent maps).

When a bitfile is created, AdvFS allocates a primary extent map record in the primary mcell. When an extra extents record is filled with extent map information, the extent map can be extended indefinitely with additional extra extent records.

• Extent maps for nonreserved files

• Extent maps for reserved files

• Encoding of extents

Extent Maps for Nonreserved Files

These characteristics are common to extent maps for nonreserved files:

• Primary extent map record

— Within the primary mcell

— Allocated when file gets a page

— If full, points to an extra extent map

• Extra extent map record is allocated as the file grows

For striped files, extent map records use a shadow extent map and may point to more than one disk.

Extent Maps for Reserved Files

Extent maps for reserved files have these characteristics:

• Primary extent map (if full, points to extra extent map)

• Extra extent map (usually only needed for the BMT itself)

All records for reserved files are in the RBMT.

Encoding of Extents

These characteristics describe extents:

• Extents information is captured in two fields

— Page number within bitfile

— Block number within virtual disks


• Size of one extent is inferred from the next

— Compute difference between the page numbers

— Place -1 in block number to indicate a hole (-2 for clone hole)

• Try out the following:

— Page = 0, block = 8000

— Page = 100, block = -1

— Page = 200, block = 9600

— Page = 300, block = -1

This format can be confusing. You will often see that an extent count displayed by the showfile -x command is one less than seen when visiting the file's mcells. The above sequence indicates a file with three extents:

• First 100 pages start at LBN 8000.

• Next 100 pages are empty (a hole in the file).

• Final 100 pages start at LBN 9600.
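The same four-entry sequence can be decoded mechanically. The following C sketch is illustrative only: the helper names are invented, the hole markers -1 and -2 are represented as unsigned 32-bit values, and the page-to-block step assumes 16 disk blocks (sectors) per 8K page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCKS_PER_PAGE 16            /* 8K page = 16 sectors of 512 bytes */
#define XTNT_HOLE       0xffffffffu   /* -1 in an unsigned field */
#define XTNT_CLONE_HOLE 0xfffffffeu   /* -2 in an unsigned field */

/* Stand-in for bsXtntT. */
typedef struct { uint32_t bsPage; uint32_t vdBlk; } xtnt_t;

/* The example sequence from the text; the last entry only terminates
 * the previous extent, which is why there are 4 entries for 3 extents. */
#define SAMPLE_MAP_LEN 4
const xtnt_t sample_map[SAMPLE_MAP_LEN] = {
    { 0, 8000 }, { 100, XTNT_HOLE }, { 200, 9600 }, { 300, XTNT_HOLE }
};

/* Size in pages of entry i, inferred from the next entry's page number. */
uint32_t xtnt_pages(const xtnt_t *map, int n, int i)
{
    assert(map != NULL && i >= 0 && i + 1 < n);
    return map[i + 1].bsPage - map[i].bsPage;
}

/* Disk block holding a given bitfile page, or -1 if the page lies in a
 * hole or beyond the mapped range. */
int64_t xtnt_page_to_block(const xtnt_t *map, int n, uint32_t page)
{
    int i;

    assert(map != NULL);
    for (i = 0; i + 1 < n; i++) {
        if (page >= map[i].bsPage && page < map[i + 1].bsPage) {
            if (map[i].vdBlk == XTNT_HOLE || map[i].vdBlk == XTNT_CLONE_HOLE)
                return -1;
            return (int64_t)map[i].vdBlk +
                   (int64_t)(page - map[i].bsPage) * BLOCKS_PER_PAGE;
        }
    }
    return -1;
}
```

With the sample map, page 99 resolves to block 9584 and page 250 to block 10400, while page 150 falls in the hole.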

The following structure is found in /msfs/msfs/bs_ods.h.

Example 2-8: Extent Structure

typedef struct bsXtnt {
    uint32T bsPage;    /* Bitfile page number */
    uint32T vdBlk;     /* Logical (disk) block number */
} bsXtntT;


Using Tags

Overview

This section discusses the function of the fileset tag files. Do not confuse these files with the .tags directory.

• Tag file characteristics

• Tag file page

• Tagmap entries

• Root tag bitfile

• Fileset tag file

• Cloning through fileset tag file

• Utility for viewing tag file

• UNIX directories

• POSIX files

• AdvFS tag files and migration

Tag File Characteristics

Tag files have these characteristics:

• Bitfiles are identified by tags

• Tag files are:

— Arrays of tagmap entries

— Indexed by tag number

Tag file entries contain:

• Sequence number (high bit set if in use)

• Volume index

• Mcell ID within volume

Tag files are used to translate a bitfile tag to the location of its primary mcell. The file is indexed by tag number and the entry contains the logical address of the primary mcell, which is a tuple of the following format:

<volume index, BMT page number, mcell's index within the BMT page>


A bitfile tag consists of a tag number and a sequence number. Whenever a bitfile is deleted, its tag is placed back on the free list. However, for various consistency reasons (like crash recovery) AdvFS cannot reuse the tag unless it is made unique from previous uses of that tag. Therefore, each time a tag is reused, its sequence number is incremented to differentiate it from the previous use of the same tag. The sequence number has a limited number of bits, so a tag can be used 4096 times and then it becomes a dead tag, never to be reused again.
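The reuse limit can be pictured with a small sketch of the 16-bit sequence field. Everything here is an assumption layered on the description in this guide (high bit marks an entry in use, twelve bits count the uses); the masks, the function names, and the choice of returning 0 for a dead tag are illustrative and are not taken from the AdvFS sources.

```c
#include <assert.h>
#include <stdint.h>

#define SEQ_IN_USE 0x8000u   /* high bit: tagmap entry is in use */
#define SEQ_MASK   0x0fffu   /* twelve-bit use counter */

/* Nonzero if the in-use bit is set. */
int seq_in_use(uint16_t seq_no)
{
    return (seq_no & SEQ_IN_USE) != 0;
}

/* The twelve-bit use counter. */
unsigned seq_count(uint16_t seq_no)
{
    return seq_no & SEQ_MASK;
}

/* Sequence value for the next use of the tag, or 0 if the counter
 * would wrap and the tag must be treated as dead. */
uint16_t seq_after_reuse(uint16_t seq_no)
{
    unsigned count = seq_count(seq_no);

    if (count == SEQ_MASK)
        return 0;                         /* dead: never reused */
    return (uint16_t)(SEQ_IN_USE | (count + 1u));
}
```

Under these assumptions, 0x8001 (in use for the first time) becomes 0x8002 on the next use, and a tag whose counter has reached 0xfff cannot be reused.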

Tag files introduce an extra level of overhead in accessing file data.

The following figure depicts an FAS file access request directed to the tag file to get the location of the primary mcell. The file's mcells will point to the logical blocks of the file.

Figure 2-9: File Access Through Tag File

The following figure shows the file's data moving physically with no changes in the FAS level structure information.

[Figure 2-9 content: the file system directory lists file1 (tag 623), file2 (tag 51), and file3 (tag 893). The fileset tag file (M1) entry for tag 893 (sequence # 3, volume # 1, BMT page 811, mcell 7) points to mcell 7 on BMT page 811, whose extent (80334, 50 pages) locates the file on disk at LBN 80334.]


Figure 2-10: Tag File Allowing Transparent Data Move

Tag File Page

The tag file page starts with a five-field, 16-byte header:

• Number of this page (for sanity checking)

• Next page with free tagmap entries

• Next free tagmap within this page

• Number of allocated tagmaps

• Number of dead tagmap entries

The header is followed by 1022 tagmap entries.

The following structure is found in msfs/msfs/bs_ods.h.

Example 2-9: Tag File Page Header Structure

typedef struct bsTDirPgHdr {
    uint32T currPage;         /* page number of this page */
    uint32T nextFreePage;     /* next page having free TMaps */
    uint16T nextFreeMap;      /* index of next free TMap, 1 based */
    uint16T numAllocTMaps;    /* count of allocated tmaps */
    uint16T numDeadTMaps;     /* count of dead tmaps */
    uint16T padding;
} bsTDirPgHdrT;

[Figure 2-10 content: the same directory (file1 tag 623, file2 tag 51, file3 tag 893) and the same fileset tag file (M1) entry for tag 893 (sequence # 3, volume # 1, BMT page 811, mcell 7) as in Figure 2-9. Only the extent in mcell 7 on BMT page 811 has changed (88526, 50 pages), so the file is now on disk at LBN 88526.]


Tagmap Entries

There are three different formats for entries in a tag file:

• Head of free tagmap list

— Stored in first slot of page 0

— Points to:

* First page with a free tagmap entry

* First uninitialized page

• Element of free tagmap list

— Sequence number

— Pointer to next free entry on this page

• Allocated tagmap entry

— Sequence number

— Virtual disk of bitfile

— Mcell ID of bitfile

The following structure is found in msfs/msfs/bs_ods.h:

Example 2-10: Tagmap Structures

/*
 * Note that the sequence number is only twelve bits. The 4-tuple
 *
 *     <domain id, bitfile set tag, bitfile tag, sequence number>
 *
 * must be unique for all time. When the sequence number wraps the
 * slot containing the tagmap struct becomes permanently unavailable.
 */
typedef struct bsTMap {
    union {
        /*
         * First tagmap struct on page zero only.
         */
        struct {
            uint32T freeList;    /* head of free list */
            uint32T unInitPg;    /* first uninitialized page */
        } tm_s1;

        /*
         * Tagmap struct on free list.
         */
        struct {
            uint16T seqNo;       /* must overlay seqNo in tm_s3 */
            uint16T unused;      /* padding to 4 byte boundary */
            uint32T nextMap;     /* next free tagmap struct within page */
        } tm_s2;

        /*
         * In use tagmap struct.
         */
        struct {
            uint16T seqNo;       /* must overlay seqNo in tm_s2 */
            uint16T vdIndex;     /* virtual disk index */
            bfMCIdT bfMCId;      /* bitfile mcell id */
        } tm_s3;
    } tm_u;
} bsTMapT;

Root Tag File

The root tag file:

• Contains entries for each fileset of the domain.

• Contains mcell IDs of fileset tagfiles.

• Can be used to find the list of filesets in the domain.

Since you can have many filesets within a domain, there must be a way to locate the tag file that is pertinent to each fileset. This is accomplished using the root tag file.

Fileset Tag File

Each fileset has its own fileset tag file.

• Maps fileset bitfiles to primary mcells

• Has these special fileset tags

— 1 fragment bitfile

— 2 root directory

— 3 .tags

— 4 user quota file

— 5 group quota file

Cloning through Fileset Tag File

Cloning is accomplished by creating a new fileset and copying the original fileset's tag file information. Note that the original fileset's data is not copied unless it is altered after the clone is created. If the original data is altered, the clone gets a copy of the unchanged data only. A clone is a read-only mechanism to be created and used for short-term operations (such as backups) and then removed. The clone should be recreated if needed again.


A clone fileset is a read-only, virtual copy of the data as it existed at the time the clone was created.

As original data changes while the clone exists:

• A copy of the original data must be created.

• A new mcell must be allocated in the BMT.

The appropriate entry in the new copy of the fileset tag file must be updated to point to the cloned page’s new mcell in the BMT.

The figure shows the fileset tag file without a clone (on the left) and then shows how the structures change when a clone is created.

Figure 2-11: Fileset Tag File Before and After Cloning

No data is copied unless a change is made to the original data. Only the data about to be changed is copied. A clone is effectively a snapshot of the data at a known time.

The following figure depicts the fileset tag file after a change has been made to the original data.

[Figure 2-11 content. No clone: the fileset tag file (M1) entry for tag 85 points to mcell 14 in the BMT, which maps the original data at LBN 919. After the clonefset command: the clone's fileset tag file (M2) also holds tag 85, mcell 14, so the original and the clone share the same mcell and data.]


Figure 2-12: Clone Structures After Data Write

Utility for Viewing Tag Files

The nvtagpg utility prints formatted pages of a root tag file or a fileset tag file.

The example shows how to use the root tag file.

Example 2-11: Displaying Root Tag File Information

# nvtagpg -r usr_domain
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 96 root TAG page 0
--------------------------------------------------------------------------
currPage 0
numAllocTMaps 3 numDeadTMaps 0 nextFreePage 0 nextFreeMap 5

tMapA[1] tag 1 seqNo 1 primary mcell (vol,page,cell) 1 0 1 usr
tMapA[2] tag 2 seqNo 1 primary mcell (vol,page,cell) 1 0 13 var
tMapA[3] tag 3 seqNo 1 primary mcell (vol,page,cell) 2 2 4 ob_fset

[Figure 2-12 content. After data has been changed: the original fileset tag file (M1) still maps tag 85 to mcell 14 (LBN 919), which now holds the original data including the changes. The clone's fileset tag file (M2) maps tag 85 to a new mcell 22 (LBN 1214), which holds the original data copied before the changes were made in LBN 919.]


The example shows how to use the fileset tag file.

Example 2-12: Displaying Fileset Tag File

# nvtagpg -r usr_domain usr
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 1047552 "usr" FRAG page 0
--------------------------------------------------------------------------
currPage 0
numAllocTMaps 1021 numDeadTMaps 0 nextFreePage 23 nextFreeMap 0

tMapA[1] tag 1 seqNo 1 primary mcell (vol,page,cell) 1 0 3
tMapA[2] tag 2 seqNo 1 primary mcell (vol,page,cell) 1 0 4
tMapA[3] tag 3 seqNo 1 primary mcell (vol,page,cell) 1 0 6
tMapA[4] tag 4 seqNo 1 primary mcell (vol,page,cell) 1 0 7
tMapA[5] tag 5 seqNo 1 primary mcell (vol,page,cell) 1 0 8

(...)

Given a particular tag number, divide by 1022:

• Quotient is the page

• Remainder is tagmap slot
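The division can be written out as a short C sketch. The helper names are invented for the illustration; the constant 1022 follows from an 8192-byte page with a 16-byte header and 8-byte tagmap entries.

```c
#include <assert.h>
#include <stdint.h>

#define TMAPS_PER_PAGE 1022u   /* (8192 - 16) / 8 */

/* Number of fixed-size entries that fit in a page after its header. */
uint32_t tagmaps_that_fit(uint32_t page_bytes, uint32_t hdr_bytes,
                          uint32_t entry_bytes)
{
    assert(entry_bytes > 0 && hdr_bytes <= page_bytes);
    return (page_bytes - hdr_bytes) / entry_bytes;
}

/* Tag file page that holds the entry for a tag (the quotient). */
uint32_t tag_page(uint32_t tag)
{
    return tag / TMAPS_PER_PAGE;
}

/* Tagmap slot within that page (the remainder). */
uint32_t tag_slot(uint32_t tag)
{
    return tag % TMAPS_PER_PAGE;
}
```

For tag 22896, tag_page gives 22 and tag_slot gives 412.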

The example shows how to use the individual file’s tag file entry.

Example 2-13: Displaying Individual File’s Tag File Entry

# ls -li big1
22896 -rwxr-xr-x 1 root system 13729520 Jun 24 16:53 big1
#
# nvtagpg -r usr_domain -T 1 -t 22896
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 2055136 "usr" FRAG page 22
--------------------------------------------------------------------------
currPage 22
numAllocTMaps 1022 numDeadTMaps 0 nextFreePage 0 nextFreeMap 0

tMapA[412] tag 22896 seqNo 1 primary mcell (vol,page,cell) 1 951 15
#
# bc
22896/1022
22
22896%1022
412
^D
#


UNIX Directories

UNIX directories are contained in standard bitfiles.

The AdvFS format is similar to the UFS format, except that:

• Tag numbers replace inode numbers.

• A 64-bit tag.sequence ID is hidden in the padding at the end of each component entry.

Two levels of directories support file migration between disks.

AdvFS directories have the same basic structure as UFS directories, except that the tag number takes the place of the inode number and the complete bitfile tag is stored after the file name in each entry. Directories are extended one 8KB page at a time. Each 8KB page is subdivided into 512-byte sections. Each section contains variable-length entries that translate a file name to an AdvFS bitfile tag (unique identifier). Each entry has the following format:

• Tag number: 32 bits

• Entry length: 16 bits

• Name length: 16 bits

• Name: variable-length, zero-padded to nearest 32-bit boundary

• Tag+sequence#: 64 bits

POSIX Files

The following figure shows how a directory file points to a fileset tag file, to an entry in the BMT and ultimately to a POSIX file.


Figure 2-13: Relationship to POSIX Files

AdvFS Tagfiles and Migration

Why does AdvFS have this extra level in the lookup path to a bitfile's metadata? Tag files are key structures that enable AdvFS to migrate bitfiles in a way that is transparent to the FAS. Migration relies on two key features:

• The ability to move a bitfile’s data transparently; this is supported by the buffer cache and I/O scheduling algorithms.

• The ability to move a bitfile’s metadata to another volume; tag files enable this.

When migration moves a bitfile’s metadata to another volume, it simply updates the bitfile’s tag directory entry to point to the new metadata location. This makes migration a purely BAS issue or feature since tag directories are part of the BAS layer and the FAS layer’s structures are left unchanged.

[Figure 2-13 content: the directory file points to the fileset tag file, which points to the BMT, which points to the file data.]


Assigning Fragments

Overview

A file that is not an exact multiple of 8K in size will most likely have a fragment assigned to it. This fragment will hold the excess data in a piece of disk storage represented in a fragment bitfile.

• Fragment bitfile

• Fragment groups

• Fragment header

• Fragment utilities

• Fragments and files

Fragment Bitfile

These are characteristics of fragment bitfiles:

• One per fileset

• Contains small (< 8K) ends of files

• Allocated in 8K (16 sector) units

If the file is less than 100K, a fragment is used if necessary.

This biases the fragmentation algorithm toward smaller files. Why make special arrangements for the final fragment (<8K) of a 300M file?

The fragment bitfile is divided into a series of 128K fragment groups.

Fragment Groups

Each fragment group consists of fragments of a particular size (free, or 0K, 1K, 2K, ... 7K).

Each fragment group has:

• One page fragment header.

• Fragments sized for the group.

Fragment addresses are usually:

• In 1K pages.

• Relative to start of fragment bitfile.


Fragments are not extents.

• An extent is a contiguous range of 8K pages.

• A fragment is a chunk of disk space between 1K and 7K in size.

The figure shows the fragment bitfile locating various fragment groups.

Figure 2-14: Fragment Bitfile Locating Fragment Groups

Fragment Header

Fields in the fragment header (1024 bytes) include:

• Pointer to next fragment group of this type with free space

• Page number of this page (a sanity check)

• Type of this group

• Number of free fragments

• Fileset ID

• Version

• List of free fragments in the group

The following structure is found in msfs/msfs/bs_bitfile_sets.h.

[Figure 2-14 content: the fragment bitfile keeps one list head for each fragment size (1K through 7K). Each list head locates the groups holding fragments of that size, for example the list of 2K frags and the list of 4K frags.]

Example 2-14: Fragment Group Header Structure

typedef struct grpHdr {
    uint32T nextFreeFrag;   /* frag index (valid only when "version == 0") */
    uint32T lastFreeFrag;   /* frag index (valid only when "version == 0") */
    uint32T nextFreeGrp;    /* page number */
    uint32T self;           /* this group's starting page number */
    bfFragT fragType;       /* type of frags in this group */
    int freeFrags;          /* number of free frags in the group */
    bfSetIdT setId;         /* bitfile-set's ID */

    /*
     * the following fields were added in ADVFS v3.0
     * they were all zeros in pre-ADVFS v3.0
     */

    unsigned int version;   /* metadata version pre-ADVFS v1.0 == 0, ADVFS v3.0 == 1 */
    uint32T firstFrag;      /* frag index */

    /*
     * the following is used as a map of the free frags in the group.
     * it is a linked list where element zero (0) is used as the head
     * of the list (since frag 0 is always the group header it can
     * never be allocated so element zero would otherwise be unused)
     */
    unsigned short freeList[ BF_FRAG_GRP_SLOTS ];
} grpHdrT;

Fragment Utilities

The nvfragpg utility displays statistics about fragment use.

This example shows summary statistics.

Example 2-15: Fragment Group Statistics Display Using nvfragpg

# nvfragpg -r usr_domain usr
==========================================================================
DOMAIN "usr_domain"
--------------------------------------------------------------------------
reading 438 frag group headers, 400 headers read
frag type    free     1K     2K     3K     4K     5K     6K     7K  totals
groups          2     44     67     80     76     63     52     54     438
frags           -   5588   4254   3386   2413   1600   1100    979   19320
frags used      -   5469   4245   3364   2405   1581   1093    973   19130
disk space   256K  5632K  8576K  10.0M  9728K  8064K  6656K  6912K   54.8M
space used      -  5469K  8490K   9.9M  9620K  7905K  6558K  6811K   53.7M
space free   254K   119K    18K    66K    32K    95K    42K    42K    668K
overhead       2K    44K    67K    80K    76K    63K    52K    54K    438K
wasted          -     0K    67K    80K   228K   126K    52K    54K    607K
% used          -    97%    98%    98%    98%    98%    98%    84%     98%
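The overhead and wasted rows in this display follow from the group layout described earlier: a 128K group loses 1K to its header, and whatever does not divide evenly into fragments of the group's size is wasted. The following is a minimal sketch of that arithmetic, not AdvFS code; the function and macro names are hypothetical.

```c
#define GROUP_KB   128   /* fragment group size given in the text */
#define HEADER_KB  1     /* 1024-byte fragment group header */

/* Number of whole fragments of frag_kb (1..7) that fit in one group. */
int frags_per_group(int frag_kb)
{
    return (GROUP_KB - HEADER_KB) / frag_kb;
}

/* Kilobytes left unused in one group after the header and whole fragments. */
int wasted_kb_per_group(int frag_kb)
{
    return (GROUP_KB - HEADER_KB) % frag_kb;
}
```

For example, a 4K group wastes 3K, so the 76 groups of that type account for the 228K shown in the wasted column, and a 5K group wastes 2K, giving 126K for 63 groups.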

This example shows a fragment free list.

Example 2-16: Fragment Free List Display Using nvfragpg

# nvfragpg -fr usr_domain usr
==========================================================================
DOMAIN "usr_domain"
--------------------------------------------------------------------------
reading 438 frag group headers, 100 headers read
reading 438 frag group headers, 200 headers read
reading 438 frag group headers, 300 headers read
reading 438 frag group headers, 400 headers read
frag type    free     1K     2K     3K     4K     5K     6K     7K  totals
groups          2     44     67     80     76     63     52     54     438
frags           -   5588   4254   3386   2413   1600   1100    979   19320
frags used      -   5469   4245   3364   2405   1581   1093    973   19130
disk space   256K  5632K  8576K  10.0M  9728K  8064K  6656K  6912K   54.8M
space used      -  5469K  8490K   9.9M  9620K  7905K  6558K  6811K   53.7M
space free   254K   119K    18K    66K    32K    95K    42K    42K    668K
overhead       2K    44K    67K    80K    76K    63K    52K    54K    438K
wasted          -     0K    67K    80K   228K   126K    52K    54K    607K
% used          -    97%    98%    98%    98%    98%    98%    84%     98%

head of free lists of frag groups from fileset attributes:
frag type BF_FRAG_ANY  firstFreeGrp 6976  lastFreeGrp 32
frag type BF_FRAG_1K   firstFreeGrp 6960  lastFreeGrp 560
frag type BF_FRAG_2K   firstFreeGrp 6880  lastFreeGrp 6880
frag type BF_FRAG_3K   firstFreeGrp 6944  lastFreeGrp 848
frag type BF_FRAG_4K   firstFreeGrp 6832  lastFreeGrp 816
frag type BF_FRAG_5K   firstFreeGrp 6896  lastFreeGrp 864
frag type BF_FRAG_6K   firstFreeGrp 6768  lastFreeGrp 6768
frag type BF_FRAG_7K   firstFreeGrp 6864  lastFreeGrp 6864

any     6976 6992
free 1K 6960 6912 6928 560
full 1K 0 112 128 176 224 320 400 496 944 1184 1616 1840 2928 3184 3280
        3424 3568 3808 3856 3904 3936 3968 4192 4544 4816 4864 4912 4928
        4944 5120 5344 5456 5536 5600 5760 6240 6528 6608 6640 6848
free 2K 6880
full 2K 64 208 256 272 288 336 480 624 784 928 1040 1168 1264 1520 1664
        2480 2736 3024 3056 3072 3088 3104 3120 3200 3360 3504 3792 3824
        3872 3888 3952 4032 4080 4144 4160 4288 4304 4352 4400 4448 4496
        4528 4560 4592 4608 4624 4656 4688 4704 4896 4976 5168 5360 5488
        5552 5584 5616 5664 5824 5984 6144 6320 6480 6560 6624 6656

(...)

free 7K 6864
full 7K 80 384 512 544 656 704 768 832 912 1024 1328 1440 1568 1680 1792
        2528 2608 2688 2800 2944 3168 3296 3408 3488 3536 3584 3600 3632
        3664 3680 3712 4048 4224 4432 4784 5072 5184 5200 5232 5264 5472
        5680 5792 5888 6000 6080 6128 6224 6336 6400 6448 6672 6816

Fragments and Files

AdvFS makes special arrangements to handle fragments of files. One of the file's mcell records found in the BMT will have fragment information.

• POSIX stats record of mcells contains fragId field.

• fragId.frag is the page offset of fragment.

• fragId.type is the size of fragment.

• Use showfile to determine if there is a fragment.

Does number of pages match file size?

• Use nvbmtpg to find fragment location.

After you have found a fragment location, you can copy it using dd, with a command similar to:

dd if=/users/.tags/1 of=/tmp/frag.cpy bs=1024 iseek=717 count=2

The following example hunts down a fragment of a file. The test file ob_1 is a copy of the /etc/disktab file.

Example 2-17: Tracking Down a Fragment

# ls -li ob_1
22894 -rwxr-xr-x   1 root     system     31114 Jun 24 15:20 ob_1
#
#
# nvbmtpg -r usr_domain usr 22894 -c
==========================================================================
DOMAIN "usr_domain"  VDI 1 (/dev/rdisk/dsk2g)  lbn 23552   BMT page 951
--------------------------------------------------------------------------
CELL 4   next mcell volume page cell 0 0 0   bfSetTag,tag 1,22894

RECORD 0   bCnt 92   BSR_ATTR
type BSRA_VALID

RECORD 1   bCnt 80   BSR_XTNTS
type BSXMT_APPEND   chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0]   bsPage 0   vdBlk 572976 (0x8be30)
bsXA[ 1]   bsPage 3   vdBlk -1

RECORD 2   bCnt 92   BMTR_FS_STAT
st_mode 100755 (S_IFREG)   st_uid 0   st_gid 0   st_size 31114
st_nlink 1   dir_tag 22893   st_mtime Thu Jun 24 15:20:45 1999
fragId.type BF_FRAG_7K   fragId.frag 54990

#
# dd if=/usr/.tags/1 of=/tmp/obfrag bs=1024 iseek=54990 count=2
2+0 records in
2+0 records out
#
#
# cat /tmp/obfrag
:

ra71|RA71|DEC RA71 Winchester:\
        :ty=winchester:dt=MSCP:ns#51:nt#14:nc#1915:\
        :oa#0:pa#131072:ba#8192:fa#1024:\
        :ob#131072:pb#262144:bb#8192:fb#1024:\
        :oc#0:pc#1367310:bc#8192:fc#1024:\
        :od#393216:pd#324698:bd#8192:fd#1024:\
        :oe#717914:pe#324698:be#8192:fe#1024:\
        :of#1042612:pf#324698:bf#8192:ff#1024:\
        :og#393216:pg#819200:bg#8192:fg#1024:\
        :oh#1212416:ph#154894:bh#8192:fh#1024:

(...)
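The numbers in this example can be checked by hand: a 31114-byte file fills three whole 8K pages (matching the extent that ends at bsPage 3) and leaves 6538 bytes, which rounds up to a 7K fragment (BF_FRAG_7K). The sketch below captures that rule as described in this section. It is an illustration only: the names are hypothetical, the 100K limit is taken from the text, and the handling of a tail larger than 7K (assumed here to take a full page instead of a fragment) is an assumption.

```c
#define PAGE_BYTES       8192L
#define FRAG_UNIT_BYTES  1024L
#define FRAG_FILE_LIMIT  (100L * 1024L)  /* "less than 100K" per the text */

/* Whole 8K pages occupied before any fragment. */
long whole_pages(long st_size)
{
    return st_size / PAGE_BYTES;
}

/*
 * Expected fragment size in KB (1..7), or 0 when no fragment is expected:
 * the file is too large, ends on a page boundary, or (assumption) has a
 * tail that would not fit in a 7K fragment.
 */
int frag_kb_for_size(long st_size)
{
    long tail = st_size % PAGE_BYTES;
    long kb;

    if (st_size <= 0 || st_size >= FRAG_FILE_LIMIT || tail == 0)
        return 0;
    kb = (tail + FRAG_UNIT_BYTES - 1) / FRAG_UNIT_BYTES;  /* round up to 1K */
    return kb <= 7 ? (int)kb : 0;
}
```

Because fragId.frag is expressed in 1K units relative to the start of the fragment bitfile, the dd command with bs=1024 can use the value 54990 directly as its iseek argument.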

The FAS layer uses fragments in the following way. When a write exceeds the fragment size, a page is allocated to the file, the fragment is copied to the new page, and the fragment is deallocated.

Defining the Storage Bitmap Bitfile and Miscellaneous Bitfile

Overview

The storage bitmap file (SBM) represents the storage within an AdvFS volume. The miscellaneous bitfile represents a fake superblock and other disk overhead structures typically found in a volume.

• Storage bitmap file characteristics

• SBM format

• Miscellaneous bitfile

Storage Bitmap Bitfile Characteristics

How does AdvFS know which disk blocks are free and which are in use? The storage bitmap file:

• Represents each 8K of on-disk storage within a volume with 1 bit (1K per bit in DIGITAL UNIX V4.0)

• Is basically a group of bits representing whether or not storage is in use.

• Is .tags/M-7

Characteristics of the SBM bitfile include:

• One per volume

• All storage is either free or in a bitfile

• AdvFS storage is allocated in clusters

1 cluster == 2 sectors == 1024 bytes

• On-disk SBM is little more than an array of bits (1 bit per page)

Each AdvFS volume contains a storage bitmap which keeps track of allocated disk space. In AdvFS terminology, a block is a 512-byte sector, a cluster is one or more contiguous blocks, and a page is 16 blocks. Each bit in the storage bitmap represents a page. If the bit is set, the page is allocated to a bitfile; if the bit is clear, the page is free (available for allocation). The cluster size is definable on a volume basis; however, AdvFS currently uses a cluster size of two blocks (1K byte) for all volumes. The bigger the page size, the smaller the bitmap.

The storage bitmap is structured as an array of 8KB pages where each page consists of an array of 32-bit integers (each bit represents a page). Each page also contains a header containing an XOR checksum of the integer array.

SBM Format

SBM page format consists of:

• SBM page header

— Sequence number, now unused (32 bits)

— XOR field for rest of page (32 bits)

• Bitmap

— 65472 bits

— Enough for 8184 pages (8K each)
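The page format above adds up to exactly one 8K page: two 32-bit header fields plus 65472 bits (2046 32-bit integers) of bitmap. The following sketch lays that out as a C structure and shows one way the XOR field could be recomputed. The structure and function names are hypothetical, and treating the checksum as a plain XOR fold over the integer array is an assumption based on the description, not a statement about the AdvFS source.

```c
#include <stddef.h>
#include <stdint.h>

#define SBM_PAGE_BYTES  8192
#define SBM_MAP_INTS    2046   /* 65472 bits / 32 bits per integer */

/* Hypothetical layout of one SBM page as described in the text. */
struct sbm_page {
    uint32_t seq;                  /* sequence number, now unused */
    uint32_t xor_sum;              /* XOR field for the rest of the page */
    uint32_t map[SBM_MAP_INTS];    /* the bitmap itself */
};

_Static_assert(sizeof(struct sbm_page) == SBM_PAGE_BYTES,
               "header plus bitmap must fill one 8K page");

/* Fold an integer array with XOR (assumed checksum rule). */
uint32_t sbm_xor(const uint32_t *map, size_t n)
{
    uint32_t x = 0;
    size_t i;

    for (i = 0; i < n; i++)
        x ^= map[i];
    return x;
}
```

A checker built on this sketch would compare sbm_xor(page->map, SBM_MAP_INTS) with page->xor_sum after reading a page through .tags/M-7.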

The following example shows access to SBM through .tags.

Example 2-18: SBM Display Through .tags/M-7

# od -x -N 1000000 /var/.tags/M-7
0000000 0000 0000 ffff 00ff ffff ffff ffff ffff
0000020 ffff ffff ffff ffff ffff ffff ffff ffff
*
0001220 ffff 00ff 0000 0000 0000 0000 0000 0000
0001240 0000 0000 0000 0000 0000 0000 0000 0000
*
0020000 0000 0000 0000 ffff ffff ffff ffff ffff
0020020 ffff ffff ffff 00ff ffff ffff ffff ffff   <== Big bitmap.
0020040 ffff ffff ffff ffff ffff ffff ffff ffff
*
0020100 ffff ffff ffff ffff 0000 ff00 ffff ffff
0020120 ffff 00ff ffff ffff ffff ffff ffff ffff
0020140 ffff ffff ffff ffff ffff ffff ffff ffff
*

(...)

0040000 0000 0000 0000 ff00 0000 0000 0000 0000
0040020 0000 0000 0000 0000 0000 0000 0000 0000
*
0040500 0000 0000 0000 0000 0000 ff00 0000 0000
0040520 0000 0000 0000 0000 0000 0000 0000 0000
*
0100000
#
#
# showfile -x /var/.tags/M-7

           Id  Vol  PgSz  Pages  XtntType  Segs  SegSz  I/O   Perf  File
fffffff9.0000    1    16      4    simple    **     **  ftx   100%  M-7

    extentMap: 1
        pageOff    pageCnt     vol    volBlock    blockCnt
              0          4       1         112          64
        extentCnt: 1
#
#
# ls -li /var/.tags/M-7
4294967289 ----------   0 root     system     24576 Dec 31  1969 /var/.tags/M-7
#

To see if a particular page, say 40000, is free:

1. Divide 40000 by 8184.

Quotient = 4, page number within SBM

Remainder = 7264, byte offset into page 4 bitmap array

2. Byte offset into the SBM for page 40000 is 4 * 8192 + 7264 + 8, or 40040. (An easier route to the same number is 40000 + (4+1)*8.)

3. Read byte with the od command:

od -x -j 40040 -N 1 /usr/.tags/M-7
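The three steps above can be written as a small function. This is a sketch of the arithmetic exactly as the steps give it (8184 bitmap bytes per SBM page, an 8-byte header at the start of each page); the names are hypothetical.

```c
#define SBM_PAGE_BYTES    8192L
#define SBM_HEADER_BYTES  8L
#define SBM_MAP_BYTES     (SBM_PAGE_BYTES - SBM_HEADER_BYTES)   /* 8184 */

/* Byte offset into the SBM bitfile for the given page, per the steps above. */
long sbm_byte_offset(long page)
{
    long sbm_page = page / SBM_MAP_BYTES;   /* quotient: page within the SBM */
    long within   = page % SBM_MAP_BYTES;   /* remainder: offset in that page's map */

    return sbm_page * SBM_PAGE_BYTES + within + SBM_HEADER_BYTES;
}
```

For page 40000 the quotient is 4 and the remainder is 7264, so the function returns 40040, the value passed to od with -j.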

This example shows information on the allocation status of page 40000.

Example 2-19: SBM Page Allocation Status Display Using vsbmpg

# vsbmpg -r usr_domain 1 -B 40000
==========================================================================
DOMAIN "usr_domain"  VDI 1 (/dev/rdisk/dsk2g)  lbn 112   SBM page 0
--------------------------------------------------------------------------
block 40000 (0x9c40) is in sbm index 625
mapInt[625] 00000000 00000000 00000000 00000000
block 40000 ^
#
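The index that vsbmpg reports here can be reproduced if each 32-bit mapInt word covers 64 blocks, that is, 32 bits at two blocks (one 1K cluster) per bit. That ratio is inferred from this sample output and is not stated in the text, so treat the following as a sketch under that assumption; the names are hypothetical.

```c
#define BLOCKS_PER_BIT  2    /* assumption inferred from the sample output */
#define BITS_PER_INT    32

/* Index of the mapInt word that covers the given 512-byte block. */
long sbm_index_for_block(long block)
{
    return block / ((long)BLOCKS_PER_BIT * BITS_PER_INT);
}

/* Bit position within that word (0 = first bit of the word). */
int sbm_bit_for_block(long block)
{
    return (int)((block / BLOCKS_PER_BIT) % BITS_PER_INT);
}
```

Under this assumption block 40000 falls in word 625 at bit 0, which agrees with the "sbm index 625" line above.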

This example shows summary SBM information displayed by vsbmpg.

Example 2-20: SBM Summary Information Display

# vsbmpg -r usr_domain
==========================================================================
DOMAIN "usr_domain"  VDI 1 (/dev/rdisk/dsk2g)
--------------------------------------------------------------------------
There are 3 pages in the SBM on this volume.
The volume has 2532999 blocks (158312 pages).
11480 pages (7%) are used.

==========================================================================
DOMAIN "usr_domain"  VDI 2 (/dev/rdisk/dsk4b)
--------------------------------------------------------------------------
There are 1 pages in the SBM on this volume.
The volume has 262144 blocks (16384 pages).
236 pages (1%) are used.

This example shows all SBM pages for a volume.

Example 2-21: SBM Pages Displayed Using vsbmpg

# vsbmpg -r usr_domain 1 -a
==========================================================================
DOMAIN "usr_domain"  VDI 1 (/dev/rdisk/dsk2g)  lbn 112   SBM page 0
--------------------------------------------------------------------------
lgSqNm 0   xor 6aa2212b
index  block  mapInt[]
    0      0  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
    4    100  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
    8    200  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
   12    300  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
   16    400  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
   20    500  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
   24    600  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
   28    700  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
   32    800  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
   36    900  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
   40    a00  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
   44    b00  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
   48    c00  000003ff 00000000 00000000 00000000  *.
   52    d00  00000000 00000000 00000000 00000000
   56    e00  00000000 00000000 00000000 00000000
   60    f00  00000000 00000000 00000000 00000000
   64   1000  00000000 00000000 00000000 00000000
   68   1100  00000000 00000000 00000000 00000000
   72   1200  00000000 00000000 00000000 00000000
   76   1300  00000000 00000000 00000000 00000000
   80   1400  00000000 00000000 00000000 00000000
   84   1500  00000000 00000000 00000000 00000000
   88   1600

(...)
 1028  10100  00000000 00000000 00000000 00000000
 1032  10200  00000000 00000000 00000000 00000000
 1036  10300  00000000 ff000000 ffffffff ffffffff  . .... ....
 1040  10400  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
 1044  10500  ffffffff ffffffff ffffffff ffffffff  .... .... .... ....
 1048  10600

(...)
 2036  1fd00  00000000 00000000 00000000 00000000
 2040  1fe00  00000000 00000000 00000000 00000000
 2044  1ff00  00000000 00000000 00000000 00000000
==========================================================================
DOMAIN "usr_domain"  VDI 1 (/dev/rdisk/dsk2g)  lbn 128   SBM page 1
--------------------------------------------------------------------------
lgSqNm 0   xor ffda9d37
index  block  mapInt[]
 2046  1ff80  2b7ff3ff ffffe3ff ffffffff fffbf7ff  ***. ..*. .... .**.
 2050  20080  e7f7ffff 0c0070ff 442fff00 c0000002  **.. * *. **. * *
 2054  20180  0bfdfffc 1c07ffe0 ffbd60ff 3e11ffff  **.* **.* .**. **..

(...)
#

Miscellaneous Bitfile

Characteristics of the miscellaneous bitfile include:

• One per volume

• Holds pages for:

— Primary and secondary boot block

— Partition table (disk label)

— Fake UFS super block with AdvFS magic number

Summary

Introducing AdvFS On-Disk Structures

AdvFS is built using a two-layer strategy separating file access support from file storage support. The two layers are:

• File access system (FAS)

• Bitfile access system (BAS)

Consider the .tags directory as a way to access the BAS from the context of the FAS. All files and metadata are accessible through .tags by using the file’s tag number or a special metadata file name.

All AdvFS on-disk structures can be accessed as bitfiles. This includes user files and directories as well as the AdvFS metadata structures.

Bitfiles are arrays of 8K disk pages holding user data or metadata. A series of contiguous 8K pages in a bitfile is stored as an extent.

Each bitfile is identified by its tag, which consists of a tag number and sequence number pair. Use the tag number to locate the extents of a file.

Several metadata bitfiles (RBMT, BMT) have an internal page organization consisting of a page header and a series of mcell data structures. Each mcell can contain a series of variable-length records describing various bitfile attributes and characteristics.

AdvFS files have a means of identification, the tag, which is:

• Similar to the UFS inode number

• Can be discovered with the ls -i command

Bitfile-sets have the following characteristics:

• FAS fileset represents a BAS bitfile-set

• Identified by numbers

Tag numbers can be reused:

• With file creation and deletion.

• Like inode numbers.

Describing BAS On-Disk Metadata Bitfiles

Each volume in an AdvFS domain consists of the following structures:

• Reserved bitfile metadata table

• Bitfile metadata table

• Storage bitmap

• Miscellaneous bitfile

Each domain is supported by the following bitfiles:

• The on-disk log

• The root tag file

Each volume is supported by the following bitfiles:

• Reserved bitfile metadata table (RBMT)

• Bitfile metadata table (BMT)

The bitfile metadata table (BMT) holds the support data for user files and directories. It contains location information, permissions and other stats, extent information, fragment location, and other descriptive data.

The BMT stores bitfile metadata, including:

• Bitfile attributes

• Bitfile extent maps

• Bitfile set attributes

• FAS file attributes including the POSIX file stats

These characteristics describe mcells and the records within them.

• Inodes of AdvFS are 28 fixed-size (292 byte) mcells packed into 8K pages

• One or more linked mcells describe bitfiles

• First mcell in list is primary mcell

• Each mcell contains variably sized records describing attributes of the bitfile

Using Extent Maps

Extent maps are stored in mcells linked to a bitfile's primary mcell. Since mcells are of limited size, an extent map may span several mcells (the pieces of the extent map in different mcells are called subextent maps).

Using Tags

Tag files translate a bitfile tag to the location of its primary mcell. The file is indexed by tag number and the file entry contains the logical address of the primary mcell, which is a tuple of the following format:

<volume index, BMT page number, mcell’s index within the BMT page>

A bitfile tag consists of a tag number and a sequence number. Whenever a bitfile is deleted, its tag is placed back on the free list. However, for various consistency reasons (like crash recovery) AdvFS cannot reuse the tag unless it is made unique from previous uses of that tag. So, each time a tag is reused, its sequence number is incremented to differentiate it from the previous use of the same tag. The sequence number has a limited number of bits, so a tag can be used 4096 times and then it becomes a dead tag, never to be reused again.

Assigning Fragments

Storage allocation in AdvFS is done in 8KB page units. For small files this can cause internal fragmentation. To solve this problem, AdvFS uses storage fragments that are 1KB to 7KB in size to store small files and the ends of files less than 100KB in size.

Fragments are allocated from the fragment bitfile, which is a metadata bitfile associated with each bitfile-set (it is always assigned tag 1). The basic structure of the fragment bitfile is a collection of fragment groups where each group contains a header and an array of fragments of a uniform size.

Defining the Storage Bitmap Bitfile and Miscellaneous Bitfile

Each AdvFS volume contains a storage bitmap that keeps track of allocated disk space. In AdvFS terminology, a block is a 512-byte sector, a cluster is one or more contiguous blocks, and a page is 16 blocks. Each bit in the storage bitmap represents a page. If the bit is set, the page is allocated to a bitfile; if it is clear, the page is free (available for allocation). The cluster size is definable on a volume basis; however, AdvFS currently uses a cluster size of two blocks (1K byte) for all volumes.

The storage bitmap is structured as an array of 8KB pages where each page consists of an array of 32-bit integers.

Exercises

The exercises in this chapter are preceded by a refresher or primer section. Please read the information carefully. It serves not only as a reminder of lecture information, it sometimes introduces new points.

Bitfiles and Tags Lab Refresher

The lowest level of the AdvFS is the bitfile access system. Here every file is a bitfile, a collection of 8192-byte pages. The higher level of AdvFS, or file access system, transforms bitfiles into normal UNIX files.

When tags are used for the first time, they are given a sequence number of 8001 hexadecimal or 32769 decimal. You will notice from the output of showfile that sequence numbers rarely get much greater than the initial 8001 value.

AdvFS file systems are also identified by tags. If you use the showfsets command on a file domain, you will see that the ID of a fileset is a sequence of four hexadecimal numbers, such as 319b7053.00092e01.2.8002.

The first two hexadecimal numbers, in this case, 319b7053.00092e01, identify the file domain. The last two hexadecimal numbers, here 2.8002, are tags that identify the fileset.

Every AdvFS file system has a .tags subdirectory that allows direct access, for the superuser, to bitfiles by tag number. A file in the /users file system with tag 1bb88.803a can be addressed using .tags through a wide variety of names including:

/users/.tags/0x1bb88
/users/.tags/113544
/users/.tags/0x1bb88.0x803a
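The three names are the same tag written in different ways: 0x1bb88 hexadecimal is 113544 decimal, and the third form appends the sequence number. As an illustration, the sketch below builds each form with snprintf; the function names are hypothetical helpers, not part of any AdvFS utility.

```c
#include <stdio.h>

/* Tag number in hexadecimal, for example /users/.tags/0x1bb88 */
int tag_path_hex(char *buf, size_t len, const char *mnt, unsigned long tag)
{
    return snprintf(buf, len, "%s/.tags/0x%lx", mnt, tag);
}

/* Tag number in decimal, for example /users/.tags/113544 */
int tag_path_dec(char *buf, size_t len, const char *mnt, unsigned long tag)
{
    return snprintf(buf, len, "%s/.tags/%lu", mnt, tag);
}

/* Tag and sequence number, for example /users/.tags/0x1bb88.0x803a */
int tag_path_hex_seq(char *buf, size_t len, const char *mnt,
                     unsigned long tag, unsigned int seq)
{
    return snprintf(buf, len, "%s/.tags/0x%lx.0x%x", mnt, tag, seq);
}
```

Each function returns the length snprintf reports, so a caller can detect truncation by comparing the result with the buffer size.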

Exercise

1. Use the showfile and the ls -i commands to list the tag numbers of a few AdvFS files and then access the files through the appropriate .tags directory.

2. Use the tag2name command, located in /sbin/advfs, to translate an AdvFS tag into the corresponding file name.

BMT and RBMT Lab Refresher

The BMT is a bitfile that contains metadata cell (mcell) records. Every virtual disk of the file domain has a BMT. The mcell records of the BMT contain almost all information that describes the files of the virtual disk plus additional information about the file domain and filesets. The RBMT contains the mcells for the BMT and other reserved bitfiles.

One important use of the BMT is the storage of extent records that describe the pages used by all bitfiles of the disk.

Each 8192-byte page of the BMT consists of a 16-byte header followed by an array of 28 mcells of 292 bytes each.

The BMT header starts with three fields used to track free mcells. It then contains a field giving the page’s number within the BMT bitfile and the version number of the advanced file system being used in this file domain. The present version number is 4.

Mcell Format Lab Refresher

Every bitfile has a primary mcell. The primary mcell is the beginning of a chain of mcells which describe the bitfile. Mcells are addressed by a 32-bit mcell ID, bfMCIdT, in which the first 27 bits give the mcell’s BMT page number and the remaining 5 bits give the mcell’s position within its BMT page. Tag files, described in another section, translate a bitfile’s tag number into a 16-bit disk index and a 32-bit mcell ID which points to the bitfile’s primary mcell.
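The 27-bit and 5-bit split can be illustrated with shifts and masks. The sketch below assumes the page number occupies the upper 27 bits and the cell position the lower 5 bits of the 32-bit value; the text does not say which end is "first", so that ordering, like the function names, is an assumption made for illustration.

```c
#include <stdint.h>

#define MCID_CELL_BITS  5
#define MCID_CELL_MASK  0x1fu    /* 5 bits: room for cells 0..31, 28 are used */

/* Combine a BMT page number and a cell position into one 32-bit ID. */
uint32_t mcid_pack(uint32_t page, uint32_t cell)
{
    return (page << MCID_CELL_BITS) | (cell & MCID_CELL_MASK);
}

/* BMT page number held in the ID. */
uint32_t mcid_page(uint32_t id)
{
    return id >> MCID_CELL_BITS;
}

/* Cell position within the BMT page. */
uint32_t mcid_cell(uint32_t id)
{
    return id & MCID_CELL_MASK;
}
```

Five bits are enough because a BMT page holds only 28 mcells.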

The 292-byte mcell begins with three fields used to link the chain of mcells associated with a particular bitfile. The first field is the 32-bit ID of the next mcell in the chain and the second field is the disk containing that mcell. Some bitfiles, in particular striped files, will have mcells located on several disks. The third field gives the position of this mcell within the chain. A disk and mcell ID of zero indicates the end of the mcell chain.

The remaining two header fields are a pair of tags that uniquely identify the bitfile. The first tag is the tag of the bitfile itself. The second tag of the pair names the bitfile set, or file system, in which the bitfile is contained.

After these five header fields, 268 bytes remain within the mcell. These 268 bytes contain mcell records.
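The sizes quoted in this refresher are consistent with one another: a 24-byte mcell header plus 268 bytes of records is 292 bytes, and a 16-byte page header plus 28 mcells is 8192 bytes. The sketch below expresses that as C structures with compile-time checks. The structure names, the 16-bit widths of the disk and chain-position fields, and the 8-byte tag are assumptions chosen so that the header comes to 24 bytes; the real definitions are in the AdvFS headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed 8-byte tag: tag number plus sequence number. */
struct tag {
    uint32_t num;
    uint32_t seq;
};

/* Hypothetical mcell layout following the five header fields in the text. */
struct mcell {
    uint32_t      next_mcell_id;   /* ID of the next mcell in the chain */
    uint16_t      next_disk;       /* disk holding that mcell (assumed 16 bits) */
    uint16_t      chain_position;  /* position of this mcell in the chain (assumed 16 bits) */
    struct tag    bitfile_tag;     /* tag of the bitfile itself */
    struct tag    set_tag;         /* tag of the bitfile set */
    unsigned char records[268];    /* mcell records */
};

/* Hypothetical BMT page: 16-byte header, then 28 mcells. */
struct bmt_page {
    unsigned char header[16];
    struct mcell  mcells[28];
};

_Static_assert(offsetof(struct mcell, records) == 24, "24-byte mcell header");
_Static_assert(sizeof(struct mcell) == 292, "mcell is 292 bytes");
_Static_assert(sizeof(struct bmt_page) == 8192, "BMT page is 8K");
```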

Reserved Mcells Lab Refresher

The primary mcells of certain important bitfiles must be located at fixed positions within the RBMT. Within RBMT page 0, there are seven reserved positions.

0 The RBMT itself

1 Storage bitmap; keeps up with free space on the disk

2 Root tag file; keeps up with fileset tag files

3 Transaction log bitfile; contains the AdvFS log

4 BMT; one of a chain of mcells associated with the BMT

5 Miscellaneous bitfile; contains the boot blocks and partition table

6 Information on volume and domain

BMT page 0 has a reserved mcell at slot 0. This mcell is the head of the BMT’s list of free mcells. Every disk must contain a BMT, a storage bitmap, and a miscellaneous bitfile; however, only one disk of the file domain needs a root tag file and a transaction log bitfile.

The reserved bitfiles have special tags. Take the index of the virtual disk on which the reserved bitfile is located, multiply that number by -6, and then subtract the mcell position found in the above table. For example, the storage bitmap of disk two has tag -13, 2*(-6)-1, and a transaction log on disk seven has tag -45, 7*(-6)-3. You can use these tags to access reserved bitfiles via the .tags directory. By the same rule, /usr/.tags/-22 (or M-22) would be the BMT of the third virtual disk, 3*(-6)-4, of the file domain that contains /usr.
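The rule in this paragraph is a one-line calculation. The sketch below applies it literally (disk index times -6, minus the reserved mcell position from the table); the function name is hypothetical.

```c
/*
 * Tag of a reserved bitfile: disk index times -6, minus the reserved
 * mcell position (0 = RBMT, 1 = storage bitmap, 2 = root tag file,
 * 3 = transaction log, 4 = BMT, 5 = miscellaneous bitfile).
 */
int reserved_tag(int disk_index, int mcell_position)
{
    return disk_index * -6 - mcell_position;
}
```

For example, reserved_tag(2, 1) gives -13 for the storage bitmap of disk two, and reserved_tag(1, 1) gives -7, which is the M-7 name used earlier to read the storage bitmap of volume 1.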

Mcell Records Lab Refresher

Mcell records vary in size and type. Every mcell record begins with a 4-byte header which gives the record's size and type. Records are simply stored sequentially in the mcell. Some records are so large that they fill the entire mcell. Others are only 8 bytes long, including header.

The lower-numbered record types are associated with the BAS layer of AdvFS. These describe bitfile information such as extents and fileset names. The higher-numbered record types are associated with the FAS layer of AdvFS. This is where information such as file modification times and symbolic link names are stored.

Exercise

Within the /sbin/advfs directory is the nvbmtpg program which prints out BMT records. Read the reference pages for nvbmtpg.

1. Either use the showfdmn program or generate a recursive directory listing of /etc/fdmns to determine the name of a virtual disk (SCSI or LSM block device) of an AdvFS file domain.

2. Use the nvbmtpg command to look at the first two pages of the RBMT on your virtual disk. Save this information in a file. If a printer is available, you may want to print out this information.

3. Use the nvbmtpg -c command to look at the mcell chains for the BMT and storage bitmap. Write down the extent map of the BMT. You will find it useful.

4. Use the showfile -x command with the BMT's .tags directory entry as an argument to verify that you successfully completed the last exercise.

5. Try using nvbmtpg with the BMT's .tags directory entry as an argument. You must have a fileset of the file domain successfully mounted to do this.

6. Verify that the data of your chosen files really is stored in the pages shown in the extent map by reading the data through the raw device that AdvFS uses to store the data. Here is one example of someone doing this exercise:

$ showfdmn -k play_dmn

               Id              Date Created  LogPgs  Domain Name
3269046b.000bc403  Sat Oct 19 12:40:11 1996     512  play_dmn

  Vol    1K-Blks        Free  % Used  Cmode  Rblks  Wblks  Vol Name
   1     1067152      402864     62%     on    128    128  /dev/rz3c
   2L    1055509      387696     63%     on    128    128  /dev/rz2c
      ----------  ----------  ------
         2122661      790560     63%

$ showfile -x bmtmisc.c

       Id  Vol  PgSz  Pages  XtntType  Segs  SegSz  Log   Perf  File
bf48.8002    1    16      1    simple    **     **  off   100%  bmtmisc.c

    extentMap: 1
        pageOff    pageCnt     vol    volBlock    blockCnt
              0          1       1     1601392          16
        extentCnt: 1

# dd if=/dev/rrz3c of=temp ibs=512 iseek=1601392 count=16
dd: 16+0 records in.
dd: 16+0 records out.

7. Now create a striped file, at least 1 megabyte in size, and use nvbmtpg -c, and showfile to find its extent map. You must look into the BMTs of at least two virtual disks to obtain this information.

8. Create a file with a few large holes and then look up its extent map. An easy way to create a file with holes is:

# dd if=/vmunix of=holesome.dat oseek=500 count=100
# dd if=/vmunix of=holesome.dat oseek=1000 count=100

9. Create a clone fileset. Use ls -i to verify that the tag numbers of the files in both the original and clone filesets are the same. Use nvbmtpg to verify that even the mcell IDs are unchanged.

10. Modify a large file in the original fileset by appending some blocks to the end of the file, for example with # cat some_file >> large_file, or by using dd with the oseek and count options. Now use ls -i and nvbmtpg to see that, while the tags are unchanged, the mcell IDs are now changed. Use showfile -x and nvbmtpg -c to look at the extent maps of both original and clone.

11. Use nvbmtpg -c to look at mcell number 0 on page 0 of the BMT for the first volume of a domain. Note that the mcell free list is minimal. This is part of the dynamics built into AdvFS for Tru64 UNIX V5.

12. Look at bitfile attributes and bitfile inheritable attributes for reserved files, nonreserved regular files, and finally nonreserved directories.

13. Look at bitfile attributes for a clone file with a modified original and then look at the bitfile attributes for the original.

14. Print out the mcell records for an original and a cloned fileset.

15. This exercise is difficult. Try deleting a fileset with lots of large files. While the deletion is in progress, examine the fileset mcell record to look at the progress of the deletion through the delete pending change.

16. Convince yourself that executing the chfsets command really does result in modification of the appropriate fileset attributes record.

17. Examine the domain attribute and virtual disk records for your AdvFS disks.

POSIX File Information Lab Refresher

BMT record type 255 is file system stats BMTR_FS_STAT.

Every nonreserved file has the standard UNIX file characteristics, such as file permissions, file owner, file group, access time, and file size. These are stored in a single BMT record.

This is very similar to the information stored in a UFS inode; however, there are three additional items: a tag pointing back to the parent directory and two fields used to record fragment identification for small UNIX files.

struct fs_stat {
    bfTagT st_ino;
    mode_t st_mode;
    uid_t st_uid;
    gid_t st_gid;
    dev_t st_rdev;
    off_t st_size;
    time_t st_atime;
    int st_uatime;
    time_t st_mtime;
    int st_umtime;
    time_t st_ctime;
    int st_uctime;
    uint_t st_flags;          /* user defined flags for file */
    bfTagT dir_tag;           /* tag of parent directory */
    bfFragIdT fragId;
    short st_nlink;
    short st_unused_1;        /* pads out the 16-bit nlink field */
    uint32T fragPageOffset;
    uint32T st_unused_2;
};

The definition of this record is found in msfs/fs_dir.h.

BMT record type 254 is fast symbolic links, BMTR_FS_DATA.

If an mcell corresponds to a symbolic link, the mode field of the POSIX file stat record is marked. If the name of the symbolic link target will fit into a BMT record, a special BMT record is created which contains the target name as its data value. Longer symbolic link names, which are very rare, are stored as file data.

1. Examine the BMT POSIX file stat records for a few of your files. See how the information stored in this record is reflected in the output of ls -l.

2. Now create some symbolic links and examine the corresponding BMT fast symbolic link records.

Trashcan Directories Lab Refresher

BMT record type 252 is undelete directory, BMTR_FS_UNDEL_DIR.

Trashcan directories have a simple, on-disk implementation. All that is created is a BMT record within the mcell chain of the "source" directory containing the tag of the trashcan. Since the BMT records of files point to their parent directories, it is not difficult to determine the appropriate trashcan for a file.

struct undel_dir_rec {
    bfTagT dir_tag;
};

1. Use mktrashcan to create a trashcan directory and then examine the BMT record that points to the trash.

Time Lab Refresher

Keeping good time is supported by a BMT mcell record.

BMT record type 251 is file system time BMTR_FS_TIME.

This 4-byte record contains a value stored in UNIX standard time. It is found within the mcell chains for the root directories of file systems. For nonroot file systems, it is typically updated on file system updates. For the root file system, it is updated at regular file system synchronization time. When the system is rebooted, this BMT record is read from the root file system and compared with the time-of-day clock as a sanity check.

Find the mcell ID of a file system root and examine its BMT file system time records.

Tag File Lab Refresher

Like all bitfiles, the tag file is composed of 8192-byte pages. Each page consists of a 16-byte header followed by 1022 tagmap entries of 8 bytes each.


The 16 bytes of the header give the logical address of the page within the tag file, pointers to free tagmap entries and to pages with free entries, a count of allocated tagmap entries, and a count of dead tagmap entries.

There are actually three record formats for the 8-byte tagmap entries. The first entry on page 0 of the tag file isn’t really used to map tags to mcell IDs. Instead it contains the page addresses of the first tag file page with a free entry and the first tag file page that has not yet been initialized.

The second format for tagmap entries links the free entries of a page. Note that even though the entry is free, the sequence number is still maintained. The final format contains a real tag to mcell ID mapping:

Because tags can be reused when files are deleted and have their tags reclaimed, sequence numbers distinguish reincarnations of the tag. If a tag is in use, the first bit of the 16-bit sequence number is on. This is why the sequence numbers printed by showfile usually start with the hexadecimal digits 80. This leaves 15 bits for recording the real sequence number.
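As a rough illustration of the flag bit described above, the following C sketch splits a raw 16-bit sequence number into its in-use flag and its 15-bit count. The macro and function names are illustrative and are not taken from the AdvFS sources. The sketch also assumes that "first bit" means the most significant bit, which is consistent with sequence numbers that begin with the hexadecimal digits 80.

```c
#include <stdint.h>

/* Assumed layout: the top bit of the 16-bit sequence number marks the
 * tag as in use, and the low 15 bits count reincarnations of the tag.
 * These names are hypothetical, chosen for this example only. */
#define SEQ_IN_USE_BIT 0x8000u
#define SEQ_VALUE_MASK 0x7FFFu

/* Returns nonzero when the raw sequence number has the in-use bit set. */
static int seq_in_use(uint16_t raw_seq)
{
    return (raw_seq & SEQ_IN_USE_BIT) != 0;
}

/* Returns the 15-bit sequence count with the in-use bit stripped. */
static unsigned seq_value(uint16_t raw_seq)
{
    return raw_seq & SEQ_VALUE_MASK;
}
```

For a raw value of 0x8001, seq_in_use() reports that the tag is in use and seq_value() returns 1.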

Given a tag number, it is not hard to find the appropriate tagmap entry. Divide the tag number by 1022 to find the appropriate tag file page. Use the remainder of that division to find the tagmap entry within the page. Just multiply that remainder by eight and add in 16 for the page header.
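The division and remainder described above can be sketched in C as follows. The constants come from the page layout given in this section (a 16-byte header and 1022 entries of 8 bytes); the struct and function names are invented for this example.

```c
#include <stdint.h>

#define TAG_PAGE_HEADER_BYTES 16u   /* header at the start of each tag file page */
#define TAG_ENTRY_BYTES        8u   /* size of one tagmap entry */
#define TAG_ENTRIES_PER_PAGE 1022u  /* tagmap entries per 8192-byte page */

/* Hypothetical result type: which tag file page, and where in that page. */
struct tag_location {
    uint32_t page;         /* tag file page number (the quotient) */
    uint32_t byte_offset;  /* byte offset of the entry within that page */
};

/* Locate the tagmap entry for a tag number. */
static struct tag_location locate_tagmap_entry(uint32_t tag)
{
    struct tag_location loc;
    loc.page = tag / TAG_ENTRIES_PER_PAGE;
    loc.byte_offset = TAG_PAGE_HEADER_BYTES
                    + (tag % TAG_ENTRIES_PER_PAGE) * TAG_ENTRY_BYTES;
    return loc;
}
```

For tag 8 this gives page 0 and byte offset 80; tag 1022 is the first entry of page 1, at byte offset 16.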

Two Types of Tag Files Lab Refresher

Every AdvFS file domain has a single root tag file that gives the location of the mcells associated with the filesets of the domain. The primary mcell of the root tag file is located in position 2 of page 0 of the RBMT. Only one virtual disk of the domain contains the true root tag file. The domain mutable attributes record of the RBMT identifies the real one. The root tag file is usually found at block 96 on the virtual disk, but an aggressive use of addvol and rmvol may cause it to move to another location.

Every AdvFS fileset has its own tag file. The fileset ID, found in the bitfile-set attributes record, is the tag for the fileset. Use this number as an index into the root tag file to find the mcell ID for the fileset itself. The extent map found in the fileset’s mcell chain gives the pages of the fileset’s own tag file. In general, the search goes in the other direction; the root tag file is searched to locate a particular fileset.

For convenience, the .tags directory has its own special naming convention for fileset tag files. Putting the letter M before the fileset ID gives the name of the tag file as found in the .tags directory. For example, if the mounted file system /playpen has fileset ID 5, /playpen/.tags/M5 is the name of /playpen’s tagfile.


Exercise

1. Start by running showfsets on your AdvFS file domains so that you will know a few fileset IDs to use in the remaining exercises.

2. Use the nvtagpg program, located in /sbin/advfs, to list the root tag files of an AdvFS file domain.

3. Select a target file and use both showfile and ls -i to obtain its tag number. The reason for using two programs is that one prints the tag number in decimal and the other prints the sequence number.

4. Divide the tag number by 1022 and write down both the quotient and remainder. The quotient determines the page number containing the appropriate tagmap entry while the remainder determines the position within the page. (The previous calculation was more useful in V4 than in V5 of Tru64 UNIX.) Now use nvtagpg to get the tagfile page entry.

If the sequence numbers of nvtagpg and showfile do not match, you do not have the right tagmap entry. You may need to convert from hexadecimal to decimal to verify the match.

5. Use the showfile -x command on the .tags M file for a fileset to determine the extent map of a fileset’s tag file.

Directory Lab Refresher

Now look at the higher-level POSIX directories. AdvFS is designed to use almost the same format for directory files as UFS. The only difference is that AdvFS uses the padding of the directory entry to store the 8-byte file tag. One advantage of this is that those unreformed programs that read UFS directory files, rather than use the getdirentries(2) system call, to determine the files of a directory, will work equally well, or poorly, on UFS and AdvFS file systems.

AdvFS has added some additional support for very large directories. The performance improvements include the creation of a B-tree index supporting directories that are more than 8K in size. This dramatically improves file creation and deletion performance. Improvement becomes more noticeable when the directory contains more than ~2500 files.

Each 8192-byte directory page ends with a 12-byte directory record that has two fields for tracking free directory entries and one field for the page type, presently always one, meaning a sequential directory.

The remaining space within the directory page is occupied by directory entries. In fact, the entire directory page is occupied by directory entries, because the 12-byte directory record is stored within the padding of a directory entry.


Every file within a directory has its own directory entry. The AdvFS directory entry has five fields and two places where padding may be inserted. The first three fields are considered the directory header. The first field, 4 bytes in size, gives the tag number of the file. In a UFS directory entry, this is where the inode number is stored. The next two fields, each 2 bytes long, give the size of the directory entry and the size of the file name.

The fourth field of the directory entry is the file name.

In AdvFS there is a fifth field, an 8-byte tag consisting of tag number and sequence. Up to 4 bytes of padding may precede the tag to ensure that the tag is stored on a 4-byte boundary. More padding may follow the tag to fill out the entire directory entry.

The reason why there may be up to 4 bytes of padding before the tag is that the file name field is always followed by at least one null character.
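To make the padding rule concrete, here is a minimal C sketch that computes the smallest directory entry that can hold a name of a given length. It assumes the layout described above: an 8-byte header (4-byte tag number plus two 2-byte sizes), the name with at least one null character, padding up to the next 4-byte boundary, and the 8-byte tag. It also assumes that entries start on 4-byte boundaries. The function names are illustrative only.

```c
#include <stddef.h>

#define DIR_HEADER_BYTES 8u  /* 4-byte tag number + 2-byte entry size + 2-byte name length */
#define DIR_TAG_BYTES    8u  /* trailing tag: tag number and sequence */

/* Bytes occupied by the name, its terminating null, and any padding
 * needed to bring the following tag onto a 4-byte boundary. */
static size_t dir_name_field_bytes(size_t name_len)
{
    return (name_len + 1 + 3) & ~(size_t)3;
}

/* Null and padding bytes between the end of the name and the tag (1 to 4). */
static size_t dir_pad_before_tag(size_t name_len)
{
    return dir_name_field_bytes(name_len) - name_len;
}

/* Smallest entry size for a name; real entries may carry extra trailing padding. */
static size_t dir_entry_min_size(size_t name_len)
{
    return DIR_HEADER_BYTES + dir_name_field_bytes(name_len) + DIR_TAG_BYTES;
}
```

Under these assumptions a one-character name needs a 20-byte entry with 3 bytes before the tag, while a four-character name needs 24 bytes, because the mandatory null pushes the tag to the next 4-byte boundary.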

Directory entries should never cross sector (512-byte) boundaries. For this reason, you often see that directory entries near the end of a sector have a generous allotment of padding, which enables them to fill the sector.
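The sector rule can be checked with a few lines of C. This is only a sketch of the arithmetic, not the kernel's allocation code; the function names are invented, and offsets are byte offsets within the directory file.

```c
#include <stddef.h>

#define SECTOR_BYTES 512u

/* Returns nonzero if an entry of `size` bytes placed at byte `offset`
 * would extend past the end of the sector in which it starts. */
static int entry_crosses_sector(size_t offset, size_t size)
{
    if (size == 0)
        return 0;
    return (offset / SECTOR_BYTES) != ((offset + size - 1) / SECTOR_BYTES);
}

/* Bytes remaining from `offset` to the end of its sector; an entry that
 * is the last one in a sector would be sized to exactly this amount. */
static size_t bytes_to_sector_end(size_t offset)
{
    return SECTOR_BYTES - (offset % SECTOR_BYTES);
}
```

A 20-byte entry at offset 492 ends exactly at the sector boundary and is acceptable, whereas the same entry at offset 500 would cross it.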

A directory entry with zero in its first 4 bytes, the location of the tag number in AdvFS and the inode number in UFS, is considered empty. When a directory is created, it is given two directory entries for . and .. which are placed at the beginning of the directory file. The remainder of the directory file is filled with empty sector-sized directory entries.

Exercise

1. Create a directory and connect into it. Now look at the directory file by typing the following commands:

# vfilepg -r domain_name fileset_name directory/spec -f d

# od -A x -a -h -H . | more

Notice the entries for . and .. along with all the empty directory entries.

2. Use the touch command to create five files, i, ii, iii, iv, and v, within your new directory. Use od and vfilepg to examine the directory file.

3. Remove the file iii. Use od to determine what happens to the directory entry for iii.

4. Now remove ii. Notice how the old directory entries for ii and iii have been merged.

5. You may have noticed that the tags for ii and iii continue to reside in the directory file and may be wondering about the possibilities for file undeletion. Return to reality by remembering what happens to free tagmap entries.


6. Create, via touch, a file vii and notice where its directory entry is placed.

7. Create several files with very, very long names and see how the creation of directory entries avoids crossing the sector boundaries.

8. Do a showfile -i command on an 8K directory file. Make a larger directory file. What does showfile -i indicate on the larger directory?

Fragment File Lab Refresher

The bitfile access system of AdvFS only allocates disk space in 8192-byte pages. Since there are some rather clear inefficiencies to storing 117-byte or even 9000-byte files in 8192-byte pages, AdvFS stores some files in an integral number (possibly zero) of 8192-byte pages followed by a fragment of 1024 to 7168 bytes.

To convince yourself that something unusual really is happening, execute the following commands:

$ dd if=/etc/disktab of=frag.file

$ ls -l frag.file

$ showfile -x frag.file

You have already encountered mention of fragments in two BMT records: the bitfile-set attributes record contained an array of eight fragment group headers, and the POSIX file stats record contained a fragment ID and page offset.

If you were paying careful attention to the output of nvtagpg, you may have noticed something else unusual; every fileset has a bitfile with tag number 1. The number one bitfile of a fileset is the fragment bitfile. If an FAS-level or POSIX-level file has its final bytes stored in a fragment, those bytes are stored inside the fragment bitfile.

Finding a File’s Fragment

The fragId field of the POSIX file stats record is used to record the position of a file’s fragment. This field is actually a two-part record: fragId.frag is the offset, in 1024-byte blocks, of the file’s fragment within the fileset’s fragment bitfile, and fragId.type is the number of 1024-byte blocks allocated to this fragment.

Two other fields of the file stats records are also involved with fragment management. The fragPageOffset field records the logical page address of the fragment within the fragment bitfile. Unless the file has holes, this will equal the number of full pages allocated to the file. One other field, st_size, is needed to determine just exactly how many bytes of the fragment are used by the file. Take the remainder of dividing this field by 8192, and you have the number of bytes of the fragment that contain real file data.

If the fragId.type field is zero, the file has no fragments.
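The fragment arithmetic described above can be sketched in C. The helper names are hypothetical; the inputs correspond to the fragId.frag, fragId.type, and st_size values discussed in this section.

```c
#include <stdint.h>

#define FRAG_BLOCK_BYTES 1024u  /* fragments are measured in 1024-byte blocks */
#define PAGE_BYTES       8192u  /* bitfile page size */

/* Byte offset of the fragment within the fileset's fragment bitfile. */
static uint64_t frag_byte_offset(uint32_t frag)
{
    return (uint64_t)frag * FRAG_BLOCK_BYTES;
}

/* Bytes allocated to the fragment; zero means the file has no fragment. */
static uint32_t frag_alloc_bytes(uint32_t type)
{
    return type * FRAG_BLOCK_BYTES;
}

/* Bytes of the fragment that hold real file data. */
static uint32_t frag_used_bytes(uint64_t st_size)
{
    return (uint32_t)(st_size % PAGE_BYTES);
}
```

As an example of the arithmetic only, a file of 31114 bytes has 6538 bytes left over after its full pages, which would fit in a fragment of type 7 (7168 bytes).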


Exercise

1. Execute the ls -l command on some fragment bitfiles. Remember that .tags/1 gets you to a fileset’s fragment bitfile.

2. Copy some randomly sized file, say /etc/disktab, onto your AdvFS file system. Now use nvbmtpg to find out where your file’s fragment resides within the fragment bitfile.

3. Now use dd to copy that fragment directly out of the fragment bitfile. You’ll use a command similar to:

# dd if=/playpen/.tags/1 of=/tmp/copy ibs=1024 iseek=76275 count=3

Managing the Fragment Bitfile

Concentrate on the on-disk structure of the fragment bitfile. Of course, the fragment file is composed of 8192-byte pages. However, these pages are managed in fragment groups of 16 pages, or 128 kilobytes, each. Within a group, all allocated fragments are of the same size, that is, the same number of 1024-byte blocks. Consequently, fragment groups fall into one of eight different types: one type for each of the seven fragment sizes, and an eighth type for fragment groups which have no allocated fragments.

Fragment Group Header

Each fragment group begins with a header. The first two fields of the header are not used in the current version of AdvFS. The third field is used to link together all the fragment groups of the same type. The fourth field is the page number of the fragment group header and a good place to verify that you are looking at a group header page. A type indicator, count of free fragments, and the fileset ID occupy the next three fields. The eighth field is the version number of the fragment implementation. Presently, we are on version 1. The ninth field is the address, in 1024-byte blocks relative to the beginning of the fragment bitfile, of the first fragment within this group. The remaining field is used as a free list for the group’s fragments.

Recall that all the fragments within a group are of the same size. This is even the case for fragment number zero, which contains the fragment group header followed by a lot of unused space. The free list is an array which has one element per fragment. Each element of the free list which corresponds to a free fragment points to the next free fragment. Array element 0 of the free list is either -1, which indicates that the group has no free fragment, or the address of the first free fragment. To find the remaining free fragments, work your way through the free list until you encounter a -1. MS-DOS gurus should think about the file allocation table.
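Walking the free list as described can be sketched as follows. This is an illustration rather than kernel code: it assumes the list elements are unsigned 16-bit values, so that -1 appears as 0xFFFF, and that elements index directly into the same array. The sample array and all names are made up for the example.

```c
#include <stddef.h>

#define FRAG_LIST_END 0xFFFFu  /* assumed encoding of -1 in an unsigned short */

/* Hypothetical six-slot free list: element 0 is the list head and points
 * to fragment 3, which points to fragment 5, which ends the list. */
static const unsigned short sample_free_list[6] = {
    3, 0, 0, 5, 0, 0xFFFF
};

/* Count the free fragments by following the list from element 0.
 * The slot bound guards against corrupt links and cycles. */
static size_t count_free_frags(const unsigned short *free_list, size_t slots)
{
    size_t count = 0;
    unsigned idx = free_list[0];

    while (idx != FRAG_LIST_END && idx < slots && count < slots) {
        count++;
        idx = free_list[idx];
    }
    return count;
}
```

For the sample array the walk visits fragments 3 and 5 and then stops, so the count is 2.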


Here’s the definition of the group header taken from the kernel source file msfs/msfs/bs_bitfile_sets.h:

typedef struct grpHdr {
    uint32T nextFreeFrag;  /* frag index (valid only when "version == 0") */
    uint32T lastFreeFrag;  /* frag index (valid only when "version == 0") */
    uint32T nextFreeGrp;   /* page number */
    uint32T self;          /* this group's starting page number */
    bfFragT fragType;      /* type of frags in this group */
    int freeFrags;         /* number of free frags in the group */
    bfSetIdT setId;        /* bitfile-set's ID */

    /*
     * the following fields were added in ADVFS v3.0
     * they were all zeros in pre-ADVFS v3.0
     */
    unsigned int version;  /* metadata version pre-ADVFS v1.0 == 0, ADVFS v3.0 == 1 */
    uint32T firstFrag;     /* frag index */

    /*
     * the following is used a map of the free frags in the group.
     * it a linked list where element zero (0) is used as the head
     * of the list (since frag 0 is always the group header it can
     * never be allocated so element zero would otherwise be unused)
     */
    unsigned short freeList[ BF_FRAG_GRP_SLOTS ];
} grpHdrT;

Finding Fragment Group Headers

The remaining mystery is the location of the group headers themselves. The addresses, in 8192-byte pages, of the first and last group headers for each of the eight fragment types are found in the last field, fragGrps, of the bitfile-set attributes record for the fileset. This record is found in the BMT chain for the fileset’s tagfile.

Exercise

1. The program nvfragpg, found in /sbin/advfs, prints various interesting statistics about fragment usage within the eight different fragment groups. Read the reference page for this command and then apply it to each of your AdvFS filesets.

2. Use the nvtagpg command to find the mcell IDs of some AdvFS bitfile-sets. Now use nvfragpg to print out the addresses of the fragment group headers for these filesets.

3. The nvfragpg program, also located in /sbin/advfs, will print out a list of the free fragments found within a fragment group along with the address of the next group of that type.


SBM Lab Refresher

Every AdvFS virtual disk has a storage bitmap bitfile which tracks free and used disk blocks. It is little more than an array of bits, one for each 8K bytes of disk storage.

SBM Page Format

Each SBM page begins with a two-field header. The first field once stored a log sequence number but is presently always set to zero. The second field is a 32-bit exclusive OR, or parity, of all the remaining 32-bit words in the SBM page. The remaining 2046 words are a bitmap for 65,472 8K pages of the virtual disk. If a cluster is allocated, its bit is set. If a cluster is free, its bit is clear. Since AdvFS always allocates disk storage in pages rather than clusters, it’s easier to imagine each page of the SBM as having 8184 bytes corresponding to pages.
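The parity field can be recomputed with a simple loop. The sketch below assumes each header field is one 32-bit word, so that an 8192-byte SBM page is 2048 words with the stored parity in word 1 and the bitmap in words 2 through 2047. The function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define SBM_WORDS_PER_PAGE 2048u  /* 8192 bytes / 4 bytes per word */
#define SBM_HEADER_WORDS      2u  /* assumed: old LSN field, then the XOR field */

/* Exclusive OR of n consecutive 32-bit words. */
static uint32_t xor_words(const uint32_t *words, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc ^= words[i];
    return acc;
}

/* Returns nonzero if the stored parity word matches the bitmap words. */
static int sbm_page_parity_ok(const uint32_t *page)
{
    uint32_t computed = xor_words(page + SBM_HEADER_WORDS,
                                  SBM_WORDS_PER_PAGE - SBM_HEADER_WORDS);
    return page[1] == computed;
}
```

A page image read from the storage bitmap file could be passed to sbm_page_parity_ok() as an array of 2048 words, provided the words are read in the machine's native byte order.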

The algorithm for determining if a page of the disk is free is simple. Take the page number and divide by 8184. The quotient points you to an SBM page, and the remainder points you to a byte within that page after you add in eight for the size of the header.
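The same calculation can be written as a small C function. It follows the byte-per-page view used in the paragraph above and returns a byte offset into the storage bitmap file; the names are invented for this sketch.

```c
#include <stdint.h>

#define SBM_PAGE_BYTES          8192u  /* size of one SBM page */
#define SBM_HEADER_BYTES           8u  /* two-field header */
#define SBM_MAP_BYTES_PER_PAGE  8184u  /* bitmap bytes left after the header */

/* Byte offset, within the SBM bitfile, of the byte describing a disk page. */
static uint64_t sbm_byte_offset(uint64_t disk_page)
{
    uint64_t sbm_page = disk_page / SBM_MAP_BYTES_PER_PAGE;  /* quotient */
    uint64_t in_page  = disk_page % SBM_MAP_BYTES_PER_PAGE;  /* remainder */
    return sbm_page * SBM_PAGE_BYTES + in_page + SBM_HEADER_BYTES;
}
```

For disk page 17000 the function returns 17024: SBM page 2, remainder 632, plus 8 bytes of header.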

Obviously, using this sort of representation to manage free storage inside the kernel would be very inefficient. We'll soon see the kernel's in-core structure, which is very different from the on-disk structure.

Exercise

1. Start out by running od -x on one of your storage bitmap files. The command syntax will be something like:

# od -x -N 1024 /usr/.tags/-7

2. Repeat the previous exercise, but this time use the virtual disk interface. You must use showfile -x to find the extent map for the storage bitmap. It will look similar to:

# od -x -j 112b -N 1024 /dev/disk/dsk3c

3. Here is how to determine if page 17000 of an AdvFS virtual disk is free:

# expr 17000 / 8184 \* 8192 + 17000 % 8184 + 8

17024

# od -x -j 17024 -N 1 /usr/.tags/-7

4. Tru64 UNIX V5 supplies a much more convenient command, vsbmpg. See the reference page for this command. Accomplish the same result as the previous exercise without all the arithmetic. Find out if page 50000 of one of your virtual disks is free.


Miscellaneous Bitfile Lab Refresher

Every page on an AdvFS virtual disk is either free or assigned to a bitfile. To satisfy this requirement, several data blocks are put into special miscellaneous bitfiles. These data blocks consist of the disk pages containing the partition table, the primary and secondary boot blocks, and the AdvFS magic number.

Exercise

Use showfile to see the extents of your miscellaneous bitfile. Find them at .tags/-11, and so forth.


Solutions

1. Use the showfile and the ls -i commands to list the tag numbers of a few AdvFS files and then access the files through the appropriate .tags directory.

#

# df -t advfs

Filesystem 512-blocks Used Available Capacity Mounted on

usr_domain#usr 1426112 1025668 294688 78% /usr

usr_domain#var 1426112 75822 294688 21% /var

bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce

bruden_dom#dennis_fset 2251840 91118 2082800 5% /usr/dennis

#

# cd /usr/dennis

#

# ls -li

total 45567

3 drwx------ 2 root system 8192 Sep 28 17:04 .tags

6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1

7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2

8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3

11 drwxrwxrwx 2 root system 8192 Sep 28 17:38 den_trash

5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group

4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user

10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 sm1

12 -rw-r--r-- 1 root system 11646960 Sep 28 17:41 stripe1

#

# ls -li big3

8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3

#

# showfile -x big3

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

8.8001 2 16 1422 simple ** ** async 100% big3

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 1422 2 75504 22752

extentCnt: 1

#

#

# ls -li ./.tags/8

8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 ./.tags/8

#

#

# showfile -x .tags/8

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File


8.8001 2 16 1422 simple ** ** async 100% 8

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 1422 2 75504 22752

extentCnt: 1

#

2. Use the tag2name command, located in /sbin/advfs, to translate an AdvFS tag into the corresponding file name.

#

# tag2name .tags/8

/usr/dennis/big3

#

#

#

# tag2name -r bruden_dom dennis_fset 8

big3

#

3. Read the reference pages for nvbmtpg.

# man nvbmtpg

(...)

4. Either use the showfdmn program or generate a recursive directory listing of /etc/fdmns to determine the name of a virtual disk (SCSI or LSM block device) of an AdvFS file domain.

# ls -lR /etc/fdmns

total 4

-r-------- 1 root system 0 Sep 28 16:59 .advfslock_bruden_dom

-r-------- 1 root system 0 Sep 15 00:38 .advfslock_domain_dsk0g

-r-------- 1 root system 0 Sep 14 16:29 .advfslock_domain_dsk2g

-r-------- 1 root system 0 Sep 11 11:08 .advfslock_fdmns

-r-------- 1 root system 0 Sep 11 11:08 .advfslock_usr_domain

drwxr-xr-x 2 root system 512 Sep 28 17:08 bruden_dom

drwxr-xr-x 2 root system 512 Sep 14 16:30 domain_dsk0g

drwxr-xr-x 2 root system 512 Sep 14 16:37 domain_dsk2g

drwxr-xr-x 2 root system 512 Sep 11 11:08 usr_domain

/etc/fdmns/bruden_dom:

total 0

lrwxr-xr-x 1 root system 15 Sep 28 16:59 dsk0a -> /dev/disk/dsk0a

lrwxr-xr-x 1 root system 15 Sep 28 17:01 dsk0b -> /dev/disk/dsk0b

lrwxr-xr-x 1 root system 15 Sep 28 17:08 dsk2h -> /dev/disk/dsk2h

/etc/fdmns/domain_dsk0g:

total 0

lrwxr-xr-x 1 root system 15 Sep 14 16:30 dsk0g -> /dev/disk/dsk0g

/etc/fdmns/domain_dsk2g:


total 0

lrwxrwxrwx 1 root system 15 Sep 14 16:37 dsk0g -> /dev/disk/dsk0g

lrwxr-xr-x 1 root system 15 Sep 14 16:28 dsk2g -> /dev/disk/dsk2g

/etc/fdmns/usr_domain:

total 0

lrwxr-xr-x 1 root system 15 Sep 11 11:08 dsk1g -> /dev/disk/dsk1g

5. Use the nvbmtpg command to look at the first two pages of the RBMT on your virtual disk. Save this information in a file. If a printer is available, you may want to print out this information.

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 65568 50% on 256 256 /dev/disk/dsk0a

2 262144 215648 18% on 256 256 /dev/disk/dsk0b

3 1858624 1801584 3% on 256 256 /dev/disk/dsk2h

---------- ---------- ------

2251840 2082800 8%

#

# nvbmtpg -rR bruden_dom 1 0

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 32 RBMT page 0

--------------------------------------------------------------------------

CELL 0 next mcell volume page cell 1 0 6 bfSetTag,tag -2,-6(RBMT)

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 0 0 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 32 (0x20)

bsXA[ 1] bsPage 1 vdBlk -1

--------------------------------------------------------------------------

CELL 1 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-7 (SBM)

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 0 0 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 112 (0x70)

bsXA[ 1] bsPage 1 vdBlk -1


--------------------------------------------------------------------------

CELL 2 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-8 (TAG)

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 0 0 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 96 (0x60)

bsXA[ 1] bsPage 1 vdBlk -1

--------------------------------------------------------------------------

CELL 3 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-9 (LOG)

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 0 0 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 128 (0x80)

bsXA[ 1] bsPage 512 vdBlk -1

--------------------------------------------------------------------------

CELL 4 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-10 (BMT)

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 0 0 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 48 (0x30)

bsXA[ 1] bsPage 1 vdBlk -1

--------------------------------------------------------------------------

CELL 5 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-11 (Misc)

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 160 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 0 0 0

firstXtnt mcellCnt 0 xCnt 3

bsXA[ 0] bsPage 0 vdBlk 0 (0x0)

bsXA[ 1] bsPage 2 vdBlk 64 (0x40)

bsXA[ 2] bsPage 4 vdBlk -1

--------------------------------------------------------------------------

CELL 6 next mcell volume page cell 1 0 27 bfSetTag,tag -2,-6(RBMT)

RECORD 0 bCnt 40 BSR_VD_ATTR


vdMntId 37f2025a.0008c86b (Wed Sep 29 08:13:14 1999)

vdIndex 1 vdBlkCnt 131072

RECORD 1 bCnt 24 BSR_DMN_ATTR

bfDomainId 37f12c39.000263ea (Tue Sep 28 16:59:37 1999)

RECORD 2 bCnt 52 BSR_DMN_MATTR

uid 0 gid 1 mode 0744

vdCnt 3

RECORD 3 bCnt 20 BSR_DMN_TRANS_ATTR

6. Use the nvbmtpg -c command to look at the mcell chains for the BMT and storage bitmap. Write down the extent map of the BMT. You’ll find it useful.

# nvbmtpg -rR bruden_dom 1 0 4 -c

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 32 RBMT page 0

--------------------------------------------------------------------------

CELL 4 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-10 (BMT)

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 0 0 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 48 (0x30)

bsXA[ 1] bsPage 1 vdBlk -1

7. Use the showfile -x command with the BMT’s .tags directory entry as an argument to verify that you successfully completed the last exercise.

# showfile -x /usr/dennis/.tags/M-10

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

fffffff6.0000 1 16 1 simple ** ** ftx 100% M-10

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 1 1 48 16

extentCnt:

8. Try using nvbmtpg with the BMT’s .tags directory entry as an argument. You must have a fileset of the file domain successfully mounted to do this.

# nvbmtpg -r /usr/dennis/.tags/M-10

==========================================================================

FILE "/usr/dennis/.tags/M-10" RBMT page 0

--------------------------------------------------------------------------

There is 1 page in the BMT in this file.

The BMT uses 1 extents (out of 1) in 1 mcell.

#


9. Verify that the data of your chosen files really is stored in the pages shown in the extent map by reading the data through the raw device that AdvFS uses to store the data. Here is one example of someone performing this exercise:

$ showfdmn -k play_dmn

Id Date Created LogPgs Domain Name

3269046b.000bc403 Sat Oct 19 12:40:11 1996 512 play_dmn

Vol 1K-Blks Free % Used Cmode Rblks Wblks Vol Name

1 1067152 402864 62% on 128 128 /dev/rz3c

2L 1055509 387696 63% on 128 128 /dev/rz2c

---------- ---------- ------

2122661 790560 63%

$ showfile -x bmtmisc.c

Id Vol PgSz Pages XtntType Segs SegSz Log Perf File

bf48.8002 1 16 1 simple ** ** off 100% bmtmisc.c

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 1 1 1601392 16

extentCnt: 1

# dd if=/dev/rrz3c of=temp ibs=512 iseek=1601392 count=16

dd: 16+0 records in.

dd: 16+0 records out.

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 65568 50% on 256 256 /dev/disk/dsk0a

2 262144 215648 18% on 256 256 /dev/disk/dsk0b

3 1858624 1801584 3% on 256 256 /dev/disk/dsk2h

---------- ---------- ------

2251840 2082800 8%

#

#

# showfile -x sm1

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

a.8001 2 16 3 simple ** ** async 100% sm1

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 3 2 121328 48

extentCnt: 1

#


# dd if=/dev/rdisk/dsk0b of=/tmp/chunk1 ibs=512 iseek=121328 count=1

1+0 records in

1+0 records out

#

#

# cat /tmp/chunk1

#

# *****************************************************************

# * *

# * Copyright (c) Digital Equipment Corporation, 1991, 1999 *

# * *

# * All Rights Reserved. Unpublished rights reserved under *

# * the copyright laws of the United States. *

# * *

# * The software contained on t# #

#

# pg sm1

#

# *****************************************************************

# * *

# * Copyright (c) Digital Equipment Corporation, 1991, 1999 *

# * *

# * All Rights Reserved. Unpublished rights reserved under *

# * the copyright laws of the United States. *

# * *

# * The software contained on this media is proprietary to *

# * and embodies the confidential technology of Digital *

# * Equipment Corporation. Possession, use, duplication or *

# * dissemination of the software and media is authorized only *

# * pursuant to a valid written license from Digital Equipment *

# * Corporation. *

# * *

# * RESTRICTED RIGHTS LEGEND Use, duplication, or disclosure *

# * by the U.S. Government is subject to restrictions as set *

# * forth in Subparagraph (c)(1)(ii) of DFARS 252.227-7013, *

# * or in FAR 52.227-19, as applicable. *

# * *

# *****************************************************************

# HISTORY
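The raw-device reads in solution 9 rely on the extent map giving volBlock in 512-byte blocks, with each 8192-byte page covering 16 of them (the PgSz column, and blockCnt equal to 16 times pageCnt). A minimal C sketch of that conversion follows; the function names are invented for the example.

```c
#include <stdint.h>

#define DISK_BLOCK_BYTES 512u  /* unit of volBlock, and of ibs=512 in the dd commands */
#define BLOCKS_PER_PAGE   16u  /* 8192-byte page / 512-byte block */

/* First 512-byte block of file page `page`, for an extent that starts at
 * file page `page_off` and disk block `vol_block`. The caller must make
 * sure the page lies inside the extent. */
static uint64_t extent_page_to_block(uint64_t page_off, uint64_t vol_block,
                                     uint64_t page)
{
    return vol_block + (page - page_off) * BLOCKS_PER_PAGE;
}

/* Byte offset on the raw device of a 512-byte block. */
static uint64_t block_to_byte_offset(uint64_t block)
{
    return block * DISK_BLOCK_BYTES;
}
```

For sm1, page 0 starts at block 121328, which is the iseek value used with dd above; page 2 of the same extent would start at block 121360.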

10. Now create a striped file, at least 1 megabyte in size, and use nvbmtpg -c and showfile to find its extent map. You must look into the BMTs of at least two virtual disks to obtain this information.

# showfile -x stripe1

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

c.8001 2 16 1422 stripe 2 8 async 100% stripe1

extentMap: 1

pageOff pageCnt volIndex volBlock blockCnt

0 8 3 1093536 11392

16 8

32 8


(…)

1392 8

1408 8

extentCnt: 1

extentMap: 2

pageOff pageCnt volIndex volBlock blockCnt

8 8 1 80704 11360

24 8

40 8

(…)

1400 8

1416 6

extentCnt: 1

#

# ls -li stripe1

12 -rw-r--r-- 1 root system 11646960 Sep 28 17:41 stripe1

#

#

# nvbmtpg -r bruden_dom dennis_fset -t 12 -c

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0

--------------------------------------------------------------------------

CELL 12 next mcell volume page cell 0 0 0 bfSetTag,tag 2,12

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_STRIPE chain mcell volume page cell 3 0 8

RECORD 2 bCnt 92 BMTR_FS_STAT

st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 11646960

st_nlink 1 dir_tag 2 st_mtime Tue Sep 28 17:41:55 1999

Extent mcells from BSR_XTNTS record chain pointer.

==========================================================================

DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 48 BMT page 0

--------------------------------------------------------------------------

CELL 8 next mcell volume page cell 1 0 18 bfSetTag,tag 2,12

RECORD 0 bCnt 260 BSR_SHADOW_XTNTS

allocVdIndex 3 mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 1093536 (0x10afa0)

bsXA[ 1] bsPage 712 vdBlk -1

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------


CELL 18 next mcell volume page cell 0 0 0 bfSetTag,tag 2,12

RECORD 0 bCnt 260 BSR_SHADOW_XTNTS

allocVdIndex 1 mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 80704 (0x13b40)

bsXA[ 1] bsPage 710 vdBlk -1

#
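The showfile output for stripe1 suggests a round-robin layout: with Segs 2 and SegSz 8, file pages 0 to 7 go to the first stripe, pages 8 to 15 to the second, and so on. The C sketch below models that pattern. It is inferred from the output shown here rather than taken from the AdvFS sources, and the function names are hypothetical.

```c
#include <stdint.h>

/* Which stripe (0-based) holds a given file page. */
static uint32_t stripe_of_page(uint64_t page, uint32_t segs, uint32_t seg_pages)
{
    return (uint32_t)((page / seg_pages) % segs);
}

/* Page number within that stripe's own extent map (the bsPage value). */
static uint64_t page_within_stripe(uint64_t page, uint32_t segs, uint32_t seg_pages)
{
    return (page / seg_pages / segs) * seg_pages + page % seg_pages;
}

/* Number of file pages that land on one stripe. */
static uint64_t pages_on_stripe(uint64_t total_pages, uint32_t segs,
                                uint32_t seg_pages, uint32_t stripe)
{
    uint64_t count = 0;
    for (uint64_t p = 0; p < total_pages; p++)
        if (stripe_of_page(p, segs, seg_pages) == stripe)
            count++;
    return count;
}
```

For the 1422-page file this model puts 712 pages on the first stripe and 710 on the second, which agrees with the terminating bsPage values 712 and 710 in the two BSR_SHADOW_XTNTS records.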

11. Create a file with a few large holes and then look up its extent map. An easy way to create a file with holes is:

# dd if=/vmunix of=holesome.dat oseek=500 count=100

# dd if=/vmunix of=holesome.dat oseek=1000 count=100

# dd if=/vmunix of=holesome.dat oseek=500 count=100

100+0 records in

100+0 records out

#

# dd if=/vmunix of=holesome.dat oseek=1000 count=100

100+0 records in

100+0 records out

#

# showfile -x holesome.dat

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

9.8002 1 16 14 simple ** ** async 33% holesome.dat

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

31 7 1 57600 112

62 1 1 57744 16

63 6 1 80512 96

extentCnt: 3

#

#

# nvbmtpg -r bruden_dom dennis_fset -t 9 -c

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

CELL 17 next mcell volume page cell 0 0 0 bfSetTag,tag 2,9

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 1 0 16

firstXtnt mcellCnt 2 xCnt 1

bsXA[ 0] bsPage 0 vdBlk -1

RECORD 2 bCnt 92 BMTR_FS_STAT

st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 563200

st_nlink 1 dir_tag 2 st_mtime Wed Sep 29 09:17:19 1999

Extent mcells from BSR_XTNTS record chain pointer.

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

CELL 16 next mcell volume page cell 0 0 0 bfSetTag,tag 2,9

RECORD 0 bCnt 264 BSR_XTRA_XTNTS

xCnt 6

bsXA[ 0] bsPage 0 vdBlk -1

bsXA[ 1] bsPage 31 vdBlk 57600 (0xe100)

bsXA[ 2] bsPage 38 vdBlk -1

bsXA[ 3] bsPage 62 vdBlk 57744 (0xe190)

bsXA[ 4] bsPage 63 vdBlk 80512 (0x13a80)

bsXA[ 5] bsPage 69 vdBlk -1
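To see how the bsXA entries relate to the showfile -x listing, it can help to decode them by hand. The following sketch assumes that each entry runs up to the bsPage of the next entry, that vdBlk -1 marks a hole (or the terminating entry), and that a page is 16 blocks (blksPerPage). The function name decode_extents is hypothetical and is not part of any AdvFS tool.

```python
BLOCKS_PER_PAGE = 16   # blksPerPage, as reported in the BSR_XTNTS record


def decode_extents(bsxa):
    """Turn (bsPage, vdBlk) pairs into extents and holes.

    Each entry covers the pages up to the next entry's bsPage;
    the final entry only terminates the map.
    """
    extents, holes = [], []
    for (page, blk), (next_page, _) in zip(bsxa, bsxa[1:]):
        count = next_page - page
        if count <= 0:
            continue
        if blk == -1:
            holes.append((page, count))
        else:
            extents.append((page, count, blk, count * BLOCKS_PER_PAGE))
    return extents, holes


# bsXA entries copied from the BSR_XTRA_XTNTS record above
holesome = [(0, -1), (31, 57600), (38, -1), (62, 57744), (63, 80512), (69, -1)]
extents, holes = decode_extents(holesome)
print(extents)  # [(31, 7, 57600, 112), (62, 1, 57744, 16), (63, 6, 80512, 96)]
print(holes)    # [(0, 31), (38, 24)]
```

The three extents match the pageOff, pageCnt, volBlock, and blockCnt columns printed by showfile -x, and the two holes correspond to the regions skipped by the oseek options.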

12. Create a clone fileset. Use ls -i to verify that the tag numbers of the files in both the original and clone filesets are the same. Use nvbmtpg to verify that even the mcell IDs are unchanged.

# clonefset bruden_dom dennis_fset den_clone

#

#

# showfdmn bruden_dom

Id Date Created LogPgs Version Domain Name

37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 131072 64000 51% on 256 256 /dev/disk/dsk0a

2 262144 215584 18% on 256 256 /dev/disk/dsk0b

3 1858624 1801584 3% on 256 256 /dev/disk/dsk2h

---------- ---------- ------

2251840 2081168 8%

#

#

# showfsets bruden_dom

bruce_fset

Id : 37f12c39.000263ea.1.8001

Files : 6, SLim= 0, HLim= 0

Blocks (512) : 68288, SLim= 50000, HLim= 60000 grc= none

Quota Status : user=off group=off

dennis_fset

Id : 37f12c39.000263ea.2.8001

Clone is : den_clone

Files : 10, SLim= 0, HLim= 0

Blocks (512) : 92618, SLim= 0, HLim= 0

Quota Status : user=off group=off

den_clone

Id : 37f12c39.000263ea.3.8003

Clone of : dennis_fset

Revision : 3

#

# mount bruden_dom#den_clone /usr/den_clone

#

# df -t advfs

Filesystem 512-blocks Used Available Capacity Mounted on

usr_domain#usr 1426112 1025780 294384 78% /usr

usr_domain#var 1426112 76026 294384 21% /var

bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce

bruden_dom#dennis_fset 2251840 92618 2081168 5% /usr/dennis

bruden_dom#den_clone 2251840 92618 2081168 5% /usr/den_clone

#

# ls -li /usr/dennis

total 45709

3 drwx------ 2 root system 8192 Sep 28 17:04 .tags

6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1

7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2

8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3

11 drwxrwxrwx 2 root system 8192 Sep 29 09:30 den_trash

9 -rw-r--r-- 1 root system 563200 Sep 29 09:17 holesome.dat

5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group

4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user

10 -rw-r--r-- 1 root system 62228 Sep 29 09:35 sm1

12 -rw-r--r-- 1 root system 11646960 Sep 28 17:41 stripe1

#

# ls -li /usr/den_clone

total 45709

3 drwx------ 2 root system 8192 Sep 28 17:04 .tags

6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1

7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2

8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3

11 drwxrwxrwx 2 root system 8192 Sep 29 09:30 den_trash

9 -rw-r--r-- 1 root system 563200 Sep 29 09:17 holesome.dat

5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group

4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user

10 -rw-r--r-- 1 root system 62228 Sep 29 09:35 sm1

12 -rw-r--r-- 1 root system 11646960 Sep 28 17:41 stripe1

#

# nvbmtpg -r bruden_dom dennis_fset -t 10 -c

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0

--------------------------------------------------------------------------

CELL 10 next mcell volume page cell 0 0 0 bfSetTag,tag 2,10

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 2 0 15

firstXtnt mcellCnt 2 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)

bsXA[ 1] bsPage 3 vdBlk -1

RECORD 2 bCnt 92 BMTR_FS_STAT

st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 62228

st_nlink 1 dir_tag 2 st_mtime Wed Sep 29 09:35:34 1999

fragId.type BF_FRAG_5K fragId.frag 129

Extent mcells from BSR_XTNTS record chain pointer.

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0

--------------------------------------------------------------------------

CELL 15 next mcell volume page cell 0 0 0 bfSetTag,tag 2,10

RECORD 0 bCnt 264 BSR_XTRA_XTNTS

xCnt 3

bsXA[ 0] bsPage 3 vdBlk 75488 (0x126e0)

bsXA[ 1] bsPage 4 vdBlk 98256 (0x17fd0)

bsXA[ 2] bsPage 7 vdBlk -1

#

#

#

# nvbmtpg -r bruden_dom den_clone -t 10 -c

This clone file has no metadata.
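Comparing the two ls -li listings by eye works for ten files; for a larger fileset a small script can do the comparison. This is a minimal sketch that assumes the listing format shown above (tag number in the first column, file name in the last) and uses a few lines copied from the two listings. The helper name tags_by_name is hypothetical.

```python
def tags_by_name(ls_li_output):
    """Map file name -> tag (inode) number from 'ls -li' output."""
    tags = {}
    for line in ls_li_output.strip().splitlines():
        fields = line.split()
        # Skip the 'total' line and anything that is not a file entry.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        tags[fields[-1]] = int(fields[0])
    return tags


original = """
total 45709
 9 -rw-r--r-- 1 root system 563200 Sep 29 09:17 holesome.dat
10 -rw-r--r-- 1 root system 62228 Sep 29 09:35 sm1
12 -rw-r--r-- 1 root system 11646960 Sep 28 17:41 stripe1
"""

clone = """
total 45709
 9 -rw-r--r-- 1 root system 563200 Sep 29 09:17 holesome.dat
10 -rw-r--r-- 1 root system 62228 Sep 29 09:35 sm1
12 -rw-r--r-- 1 root system 11646960 Sep 28 17:41 stripe1
"""

orig_tags = tags_by_name(original)
clone_tags = tags_by_name(clone)
mismatches = sorted(n for n in orig_tags if clone_tags.get(n) != orig_tags[n])
print(mismatches)   # [] when every tag matches
```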

13. Modify a large file in the original fileset by appending some blocks to the end of the file (for example, # cat some_file >> large_file, or use dd with the oseek and count options). Then use ls -i and nvbmtpg to see that, while the tags are unchanged, the mcell IDs have changed. Use showfile -x and nvbmtpg -c to look at the extent maps of both the original and the clone.

# cat /etc/disktab >> /usr/dennis/sm1

#

# ls -li /usr/dennis/sm1

10 -rw-r--r-- 1 root system 93342 Sep 29 10:39 /usr/dennis/sm1

#

# ls -li /usr/den_clone/sm1

10 -rw-r--r-- 1 root system 62228 Sep 29 09:35 /usr/den_clone/sm1

#

# showfile -x /usr/dennis/sm1

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

a.8001 2 16 11 simple ** ** async 25% sm1

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 3 2 121328 48

3 1 2 75488 16

4 3 2 98256 48

7 4 2 75360 64

extentCnt: 4

#

# showfile -x /usr/den_clone/sm1

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

a.8001 3 16 0 simple ** ** async 100% sm1

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

extentCnt: 0

#

# nvbmtpg -r bruden_dom den_clone -t 10 -c

==========================================================================

DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 48 BMT page 0

--------------------------------------------------------------------------

CELL 5 next mcell volume page cell 0 0 0 bfSetTag,tag 3,10

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 0 0 0

firstXtnt mcellCnt 1 xCnt 1

bsXA[ 0] bsPage 0 vdBlk -1

RECORD 2 bCnt 92 BMTR_FS_STAT

st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 62228

st_nlink 1 dir_tag 2 st_mtime Wed Sep 29 09:35:34 1999

fragId.type BF_FRAG_5K fragId.frag 129

#

# nvbmtpg -r bruden_dom dennis_fset -t 10 -c

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0

--------------------------------------------------------------------------

CELL 10 next mcell volume page cell 0 0 0 bfSetTag,tag 2,10

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 2 0 15

firstXtnt mcellCnt 2 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)

bsXA[ 1] bsPage 3 vdBlk -1

RECORD 2 bCnt 92 BMTR_FS_STAT

st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 93342

st_nlink 1 dir_tag 2 st_mtime Wed Sep 29 10:39:23 1999

fragId.type BF_FRAG_4K fragId.frag 260

Extent mcells from BSR_XTNTS record chain pointer.

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0

--------------------------------------------------------------------------

CELL 15 next mcell volume page cell 0 0 0 bfSetTag,tag 2,10

RECORD 0 bCnt 264 BSR_XTRA_XTNTS

xCnt 4

bsXA[ 0] bsPage 3 vdBlk 75488 (0x126e0)

bsXA[ 1] bsPage 4 vdBlk 98256 (0x17fd0)

bsXA[ 2] bsPage 7 vdBlk 75360 (0x12660)

bsXA[ 3] bsPage 11 vdBlk -1
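The fragId.type values in the BMTR_FS_STAT records (BF_FRAG_5K before the append, BF_FRAG_4K after) are consistent with the file's last partial page being held in a fragment rounded up to the next 1K. The sketch below reproduces that arithmetic under stated assumptions: an 8192-byte page, 1024-byte fragment units, and the smallest fragment that fits the tail. It does not model tails that need more than 7K, and page_and_frag is a hypothetical name.

```python
PAGE_BYTES = 8192   # assumed page size (bfPgSz 16 x 512-byte blocks)
FRAG_UNIT = 1024    # assumed fragment granularity (BF_FRAG_1K .. BF_FRAG_7K)


def page_and_frag(st_size):
    """Return (full pages in the extent map, fragment size in K for the tail)."""
    full_pages, tail = divmod(st_size, PAGE_BYTES)
    frag_kb = -(-tail // FRAG_UNIT)   # ceiling division; 0 means no fragment
    return full_pages, frag_kb


print(page_and_frag(62228))   # (7, 5)  -> 7 pages mapped, BF_FRAG_5K
print(page_and_frag(93342))   # (11, 4) -> 11 pages mapped, BF_FRAG_4K
```

The page counts agree with the extent maps: the original sm1 mapped pages 0 through 6, and after the append showfile -x reports 11 pages.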

14. Use nvbmtpg -c to look at mcell number 0 on page 0 of the BMT for the first volume of a domain. Note that the mcell free list is minimal. This is part of the dynamics built into AdvFS for Tru64 UNIX V5.

# nvbmtpg -r bruden_dom 3 0 0 -c

==========================================================================

DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 48 BMT page 0

--------------------------------------------------------------------------

CELL 0 next mcell volume page cell 0 0 0 bfSetTag,tag 0,0

RECORD 0 bCnt 8 BSR_MCELL_FREE_LIST

headPg 0

RECORD 1 bCnt 12 BSR_DEF_DEL_MCELL_LIST

nextMCId page,cell 0,0 prevMCId page,cell 0,0

15. Look at bitfile attributes and bitfile inheritable attributes for reserved files, nonreserved regular files, and finally nonreserved directories.

#

# nvbmtpg -rv bruden_dom dennis_fset -t 1

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 10 nextFreePg -1 nextfreeMCId page,cell 0,18

--------------------------------------------------------------------------

CELL 4 linkSegment 0 bfSetTag 2 (2.8001) tag 1 (1.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 2

cloneId 0 cloneCnt 3 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_FTX_AGENT (2)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 2 0 17

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 2 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 121392 (0x1da30)

bsXA[ 1] bsPage 48 vdBlk -1

#

# nvbmtpg -rv bruden_dom dennis_fset -t 10

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 10 nextFreePg -1 nextfreeMCId page,cell 0,18

--------------------------------------------------------------------------

CELL 10 linkSegment 0 bfSetTag 2 (2.8001) tag 10 (a.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 940

cloneId 0 cloneCnt 3 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_NIL (0)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 2 0 15

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 2 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)

bsXA[ 1] bsPage 3 vdBlk -1

RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)

st_ino 10 st_mode 100644 (S_IFREG) st_nlink 1 st_size 124456

st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0

st_mtime Wed Sep 29 11:13:22 1999 st_umtime 538485000

st_atime Wed Sep 29 11:09:47 1999 st_uatime 126376000

st_ctime Wed Sep 29 11:13:22 1999 st_uctime 538485000

fragId.frag 386 fragId.type 2 BF_FRAG_2K fragPageOffset 15

dir_tag 2 (2.8001) st_flags 0 st_unused_1 983040 st_unused_2 0

#

# nvbmtpg -rv bruden_dom dennis_fset -t 14

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24

--------------------------------------------------------------------------

CELL 22 linkSegment 0 bfSetTag 2 (2.8001) tag 14 (e.8001)

next mcell volume page cell 1 0 23

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 269

cloneId 0 cloneCnt 3 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_FTX_AGENT (2)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 0 0 0

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 92080 (0x167b0)

bsXA[ 1] bsPage 1 vdBlk -1

RECORD 2 bCnt 64 version 0 BSR_BF_INHERIT_ATTR (16)

dataSafety BFD_NIL

reqServices 1

optServices 0

extendSize 0

clientArea 0 0 0 0

rsvd1 0

rsvd2 0

rsvd_sec1 0

rsvd_sec2 0

rsvd_sec3 0

16. Look at bitfile attributes for a clone file with a modified original and then look at the bitfile attributes for the original.

# ls -li /usr/dennis/sm1

10 -rw-r--r-- 1 root system 124456 Sep 29 11:13 /usr/dennis/sm1

#

# ls -li /usr/den_clone/sm1

10 -rw-r--r-- 1 root system 62228 Sep 29 09:35 /usr/den_clone/sm1

#

# nvbmtpg -rv bruden_dom den_clone -t 10

==========================================================================

DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 19 nextFreePg -1 nextfreeMCId page,cell 0,10

--------------------------------------------------------------------------

CELL 5 linkSegment 0 bfSetTag 3 (3.8003) tag 10 (a.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 234

cloneId 3 cloneCnt 0 maxClonePgs 7

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_NIL (0)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 0 0 0

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 1 xCnt 1

bsXA[ 0] bsPage 0 vdBlk -1

bsXA[ 1] bsPage 1 vdBlk -1

RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)

st_ino 10 st_mode 100644 (S_IFREG) st_nlink 1 st_size 62228

st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0

st_mtime Wed Sep 29 09:35:34 1999 st_umtime 250400000

st_atime Wed Sep 29 09:02:24 1999 st_uatime 451571000

st_ctime Wed Sep 29 09:35:34 1999 st_uctime 250400000

fragId.frag 129 fragId.type 5 BF_FRAG_5K fragPageOffset 7

dir_tag 2 (2.8001) st_flags 0 st_unused_1 458752 st_unused_2 0

#

# nvbmtpg -rv bruden_dom dennis_fset -t 10

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 10 nextFreePg -1 nextfreeMCId page,cell 0,18

--------------------------------------------------------------------------

CELL 10 linkSegment 0 bfSetTag 2 (2.8001) tag 10 (a.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 940

cloneId 0 cloneCnt 3 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_NIL (0)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 2 0 15

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 2 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)

bsXA[ 1] bsPage 3 vdBlk -1

RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)

st_ino 10 st_mode 100644 (S_IFREG) st_nlink 1 st_size 124456

st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0

st_mtime Wed Sep 29 11:13:22 1999 st_umtime 538485000

st_atime Wed Sep 29 11:09:47 1999 st_uatime 126376000

st_ctime Wed Sep 29 11:13:22 1999 st_uctime 538485000

fragId.frag 386 fragId.type 2 BF_FRAG_2K fragPageOffset 15

dir_tag 2 (2.8001) st_flags 0 st_unused_1 983040 st_unused_2 0

17. Print out the mcell records for an original and a cloned fileset.

# nvbmtpg -rv bruden_dom dennis_fset -c

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24

--------------------------------------------------------------------------

CELL 6 linkSegment 0 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)

next mcell volume page cell 1 0 7

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 2

cloneId 0 cloneCnt 0 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_FTX_AGENT (2)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 0 0 0

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 34688 (0x8780)

bsXA[ 1] bsPage 8 vdBlk -1

RECORD 2 bCnt 64 version 0 BSR_BFS_QUOTA_ATTR (18)

blkHLimitHi,blkHLimitLo 0,30d40 (200000)

blkSLimitHi,blkSLimitLo 0,0 (0)

fileHLimitHi,fileHLimitLo 0,0 (0)

fileSLimitHi,fileSLimitLo 0,0 (0)

blkTLimit 0, fileTLimit 0, quotaStatus 1420

unused1 0, unused2 0, unused3 0, unused4 0

--------------------------------------------------------------------------

--------------------------------------------------------------------------

CELL 7 linkSegment 1 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 220 version 0 BSR_BFS_ATTR (8)

bfSetId.domainId 37f12c39.263ea (Tue Sep 28 16:59:37 1999)

bfSetId.dirTag 2 (2.8001)

fragBfTag 1 (1.8001)

nextCloneSetTag 3 (3.8003) origSetTag 0 (0.0)

nxtDelPendingBfSet 0 (0.0)

state BFS_READY flags 0x0

cloneId 0 cloneCnt 3 numClones 1

fsDev 0xaf0db242 freeFragGrps 2 oldQuotaStatus 0

uid 0 gid 1 mode 0744 setName "dennis_fset"

fsContext[0], fsContext[1] 2.8001 (rootTag)

fsContext[2], fsContext[3] 3.8001 (tagsTag)

fsContext[4], fsContext[5] 4.8001 (userQuotaTag)

fsContext[6], fsContext[7] 5.8001 (groupQuotaTag)

fragGrps[0] firstFreeGrp 64 lastFreeGrp 32

fragGrps[1] firstFreeGrp -1 lastFreeGrp -1

fragGrps[2] firstFreeGrp 48 lastFreeGrp 48

fragGrps[3] firstFreeGrp -1 lastFreeGrp -1

fragGrps[4] firstFreeGrp 32 lastFreeGrp 32

fragGrps[5] firstFreeGrp 16 lastFreeGrp 16

fragGrps[6] firstFreeGrp -1 lastFreeGrp -1

fragGrps[7] firstFreeGrp 0 lastFreeGrp 0

RECORD 1 bCnt 36 version 0 BSR_SET_SHELVE_ATTR (17)

flags MSS_NO_SHELVE (0x4)

smallFile 5

readAhead 0

readAheadIncr 5

readAheadMax 50

autoShelveThresh 100

userId 0

shelf 0

# nvbmtpg -rv bruden_dom den_clone -c

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24

--------------------------------------------------------------------------

CELL 19 linkSegment 0 bfSetTag -2 (fffffffe.0) tag 3 (3.8003)

next mcell volume page cell 1 0 14

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 154

cloneId 0 cloneCnt 0 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_FTX_AGENT (2)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 1 0 20

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 2 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 57712 (0xe170)

bsXA[ 1] bsPage 2 vdBlk -1

RECORD 2 bCnt 64 version 0 BSR_BFS_QUOTA_ATTR (18)

blkHLimitHi,blkHLimitLo 0,0 (0)

blkSLimitHi,blkSLimitLo 0,0 (0)

fileHLimitHi,fileHLimitLo 0,0 (0)

fileSLimitHi,fileSLimitLo 0,0 (0)

blkTLimit 0, fileTLimit 0, quotaStatus 0

unused1 0, unused2 0, unused3 0, unused4 0

--------------------------------------------------------------------------

--------------------------------------------------------------------------

CELL 14 linkSegment 1 bfSetTag -2 (fffffffe.0) tag 3 (3.8003)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 220 version 0 BSR_BFS_ATTR (8)

bfSetId.domainId 37f12c39.263ea (Tue Sep 28 16:59:37 1999)

bfSetId.dirTag 3 (3.8003)

fragBfTag 1 (1.8001)

nextCloneSetTag 0 (0.0) origSetTag 2 (2.8001)

nxtDelPendingBfSet 0 (0.0)

state BFS_READY flags 0x0

cloneId 3 cloneCnt 0 numClones 0

fsDev 0xdc8e9f23 freeFragGrps 0 oldQuotaStatus 0

uid 0 gid 1 mode 0744 setName "den_clone"

fsContext[0], fsContext[1] 2.8001 (rootTag)

fsContext[2], fsContext[3] 3.8001 (tagsTag)

fsContext[4], fsContext[5] 4.8001 (userQuotaTag)

fsContext[6], fsContext[7] 5.8001 (groupQuotaTag)

fragGrps[0] firstFreeGrp 32 lastFreeGrp 32

fragGrps[1] firstFreeGrp -1 lastFreeGrp -1

fragGrps[2] firstFreeGrp -1 lastFreeGrp -1

fragGrps[3] firstFreeGrp -1 lastFreeGrp -1

fragGrps[4] firstFreeGrp -1 lastFreeGrp -1

fragGrps[5] firstFreeGrp 16 lastFreeGrp 16

fragGrps[6] firstFreeGrp -1 lastFreeGrp -1

fragGrps[7] firstFreeGrp 0 lastFreeGrp 0

RECORD 1 bCnt 36 version 0 BSR_SET_SHELVE_ATTR (17)

flags MSS_NO_SHELVE (0x4)

smallFile 5

readAhead 0

readAheadIncr 5

readAheadMax 50

autoShelveThresh 100

userId 0

shelf 0

Extent mcells from BSR_XTNTS record chain pointer.

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24

--------------------------------------------------------------------------

CELL 20 linkSegment 0 bfSetTag -2 (fffffffe.0) tag 3 (3.8003)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 264 version 0 BSR_XTRA_XTNTS (5)

blksPerPage 16

xCnt 2

bsXA[ 0] bsPage 2 vdBlk 80608 (0x13ae0)

bsXA[ 1] bsPage 8 vdBlk -1

bsXA[ 2] bsPage 0 vdBlk 0 (0x0)

bsXA[ 3] bsPage 0 vdBlk 0 (0x0)

bsXA[ 4] bsPage 0 vdBlk 0 (0x0)

bsXA[ 5] bsPage 0 vdBlk 0 (0x0)

bsXA[ 6] bsPage 0 vdBlk 0 (0x0)

bsXA[ 7] bsPage 0 vdBlk 0 (0x0)

bsXA[ 8] bsPage 0 vdBlk 0 (0x0)

bsXA[ 9] bsPage 0 vdBlk 0 (0x0)

bsXA[10] bsPage 0 vdBlk 65616 (0x10050)

bsXA[11] bsPage 2 vdBlk 1 (0x1)

bsXA[12] bsPage 19 vdBlk 0 (0x0)

bsXA[13] bsPage 0 vdBlk 16 (0x10)

bsXA[14] bsPage 8 vdBlk 0 (0x0)

bsXA[15] bsPage 0 vdBlk 0 (0x0)

bsXA[16] bsPage 0 vdBlk -1

bsXA[17] bsPage 0 vdBlk 0 (0x0)

bsXA[18] bsPage 0 vdBlk 0 (0x0)

bsXA[19] bsPage 0 vdBlk 0 (0x0)

bsXA[20] bsPage 0 vdBlk 4 (0x4)

bsXA[21] bsPage 0 vdBlk 0 (0x0)

bsXA[22] bsPage 0 vdBlk 0 (0x0)

bsXA[23] bsPage 0 vdBlk 0 (0x0)

bsXA[24] bsPage 0 vdBlk 0 (0x0)

bsXA[25] bsPage 0 vdBlk 0 (0x0)

bsXA[26] bsPage 0 vdBlk 0 (0x0)

bsXA[27] bsPage 0 vdBlk 0 (0x0)

bsXA[28] bsPage 0 vdBlk 0 (0x0)

bsXA[29] bsPage 0 vdBlk 0 (0x0)

bsXA[30] bsPage 0 vdBlk 0 (0x0)

bsXA[31] bsPage 0 vdBlk 0 (0x0)
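The verbose listings print each tag twice, for example "tag 10 (a.8001)" and "bfSetTag -2 (fffffffe.0)". The sketch below decodes the parenthesized form, assuming it is two hexadecimal fields separated by a dot, with the first field being the tag number as a 32-bit two's-complement value. The second field is treated here simply as a number (presumably a sequence number; that interpretation is an assumption). The function name parse_tag is hypothetical.

```python
def parse_tag(text):
    """Decode a 'num.seq' tag such as 'a.8001' or 'fffffffe.0'."""
    num_hex, seq_hex = text.split(".")
    num = int(num_hex, 16)
    if num >= 0x80000000:        # interpret as a signed 32-bit value
        num -= 0x100000000
    return num, int(seq_hex, 16)


print(parse_tag("a.8001"))       # (10, 32769)
print(parse_tag("3.8003"))       # (3, 32771)
print(parse_tag("fffffffe.0"))   # (-2, 0)
```

The decoded numbers agree with the decimal values nvbmtpg prints beside each tag, and the same notation appears in the Id column of showfile.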

18. This exercise is difficult. Try deleting a fileset that contains many large files. While the deletion is in progress, use nvbmtpg to examine the fileset's mcell records and try to catch the delete-pending state as it changes.

19. Convince yourself that executing the chfsets command really does result in modification of the appropriate fileset attributes record.

# nvbmtpg -rv bruden_dom dennis_fset -c

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24

--------------------------------------------------------------------------

CELL 6 linkSegment 0 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)

next mcell volume page cell 1 0 7

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 2

cloneId 0 cloneCnt 0 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_FTX_AGENT (2)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 0 0 0

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 34688 (0x8780)

bsXA[ 1] bsPage 8 vdBlk -1

RECORD 2 bCnt 64 version 0 BSR_BFS_QUOTA_ATTR (18)

blkHLimitHi,blkHLimitLo 0,30d40 (200000)

blkSLimitHi,blkSLimitLo 0,0 (0)

fileHLimitHi,fileHLimitLo 0,0 (0)

fileSLimitHi,fileSLimitLo 0,0 (0)

blkTLimit 0, fileTLimit 0, quotaStatus 1420

unused1 0, unused2 0, unused3 0, unused4 0

--------------------------------------------------------------------------

--------------------------------------------------------------------------

CELL 7 linkSegment 1 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 220 version 0 BSR_BFS_ATTR (8)

bfSetId.domainId 37f12c39.263ea (Tue Sep 28 16:59:37 1999)

bfSetId.dirTag 2 (2.8001)

fragBfTag 1 (1.8001)

nextCloneSetTag 3 (3.8003) origSetTag 0 (0.0)

nxtDelPendingBfSet 0 (0.0)

state BFS_READY flags 0x0

cloneId 0 cloneCnt 3 numClones 1

fsDev 0xaf0db242 freeFragGrps 2 oldQuotaStatus 0

uid 0 gid 1 mode 0744 setName "dennis_fset"

fsContext[0], fsContext[1] 2.8001 (rootTag)

fsContext[2], fsContext[3] 3.8001 (tagsTag)

fsContext[4], fsContext[5] 4.8001 (userQuotaTag)

fsContext[6], fsContext[7] 5.8001 (groupQuotaTag)

fragGrps[0] firstFreeGrp 64 lastFreeGrp 32

fragGrps[1] firstFreeGrp -1 lastFreeGrp -1

fragGrps[2] firstFreeGrp 48 lastFreeGrp 48

fragGrps[3] firstFreeGrp -1 lastFreeGrp -1

fragGrps[4] firstFreeGrp 32 lastFreeGrp 32

fragGrps[5] firstFreeGrp 16 lastFreeGrp 16

fragGrps[6] firstFreeGrp -1 lastFreeGrp -1

fragGrps[7] firstFreeGrp 0 lastFreeGrp 0

RECORD 1 bCnt 36 version 0 BSR_SET_SHELVE_ATTR (17)

flags MSS_NO_SHELVE (0x4)

smallFile 5

readAhead 0

readAheadIncr 5

readAheadMax 50

autoShelveThresh 100

userId 0

shelf 0

#

#

# chfsets -b 200000 bruden_dom dennis_fset

chfsets: At least one fileset in this domain must be mounted.

#

# mount bruden_dom#dennis_fset /usr/dennis

#

#

# mount bruden_dom#bruce_fset /usr/bruce

#

# mount bruden_dom#den_clone /usr/den_clone

#

#

# df -t advfs

Filesystem 512-blocks Used Available Capacity Mounted on

usr_domain#usr 1426112 1025888 235376 82% /usr

usr_domain#var 1426112 134972 235376 37% /var

bruden_dom#dennis_fset 200000 92940 107060 47% /usr/dennis

bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce

bruden_dom#den_clone 2251840 92618 2079520 5% /usr/den_clone

#

#

# chfsets -b 200000 bruden_dom dennis_fset

dennis_fset

Id : 37f12c39.000263ea.2.8001

Block H Limit: 100000 --> 200000

#

# nvbmtpg -rv bruden_dom dennis_fset -c

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24

--------------------------------------------------------------------------

CELL 6 linkSegment 0 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)

next mcell volume page cell 1 0 7

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 2

cloneId 0 cloneCnt 0 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_FTX_AGENT (2)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 0 0 0

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 34688 (0x8780)

bsXA[ 1] bsPage 8 vdBlk -1

RECORD 2 bCnt 64 version 0 BSR_BFS_QUOTA_ATTR (18)

blkHLimitHi,blkHLimitLo 0,30d40 (200000)

blkSLimitHi,blkSLimitLo 0,0 (0)

fileHLimitHi,fileHLimitLo 0,0 (0)

fileSLimitHi,fileSLimitLo 0,0 (0)

blkTLimit 0, fileTLimit 0, quotaStatus 1420

unused1 0, unused2 0, unused3 0, unused4 0

--------------------------------------------------------------------------

--------------------------------------------------------------------------

CELL 7 linkSegment 1 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 220 version 0 BSR_BFS_ATTR (8)

bfSetId.domainId 37f12c39.263ea (Tue Sep 28 16:59:37 1999)

bfSetId.dirTag 2 (2.8001)

fragBfTag 1 (1.8001)

nextCloneSetTag 3 (3.8003) origSetTag 0 (0.0)

nxtDelPendingBfSet 0 (0.0)

state BFS_READY flags 0x0

cloneId 0 cloneCnt 3 numClones 1

fsDev 0xaf0db242 freeFragGrps 2 oldQuotaStatus 0

uid 0 gid 1 mode 0744 setName "dennis_fset"

fsContext[0], fsContext[1] 2.8001 (rootTag)

fsContext[2], fsContext[3] 3.8001 (tagsTag)

fsContext[4], fsContext[5] 4.8001 (userQuotaTag)

fsContext[6], fsContext[7] 5.8001 (groupQuotaTag)

fragGrps[0] firstFreeGrp 64 lastFreeGrp 32

fragGrps[1] firstFreeGrp -1 lastFreeGrp -1

fragGrps[2] firstFreeGrp 48 lastFreeGrp 48

fragGrps[3] firstFreeGrp -1 lastFreeGrp -1

fragGrps[4] firstFreeGrp 32 lastFreeGrp 32

fragGrps[5] firstFreeGrp 16 lastFreeGrp 16

fragGrps[6] firstFreeGrp -1 lastFreeGrp -1

fragGrps[7] firstFreeGrp 0 lastFreeGrp 0

RECORD 1 bCnt 36 version 0 BSR_SET_SHELVE_ATTR (17)

flags MSS_NO_SHELVE (0x4)

smallFile 5

readAhead 0

readAheadIncr 5

readAheadMax 50

autoShelveThresh 100

userId 0

shelf 0

20. Examine the domain attribute and virtual disk records for your AdvFS disks.

# nvbmtpg -r bruden_dom -f

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

There is 1 page in the BMT on this volume.

The BMT uses 1 extents (out of 1) in 1 mcell.

There are 1 pages on the free list with a total of 4 free mcells.

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0

--------------------------------------------------------------------------

There is 1 page in the BMT on this volume.

The BMT uses 1 extents (out of 1) in 1 mcell.

There are 1 pages on the free list with a total of 10 free mcells.

==========================================================================

DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 48 BMT page 0

--------------------------------------------------------------------------

There is 1 page in the BMT on this volume.

The BMT uses 1 extents (out of 1) in 1 mcell.

There are 1 pages on the free list with a total of 19 free mcells.

#

#

# nvbmtpg -rRv bruden_dom 1 0 6

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 32 RBMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 20 nextFreePg 0 nextfreeMCId page,cell 0,7

--------------------------------------------------------------------------

CELL 6 linkSegment 1 bfSetTag -2 (fffffffe.0) tag -6 (fffffffa.0)(RBMT)

next mcell volume page cell 1 0 27

RECORD 0 bCnt 40 version 0 BSR_VD_ATTR (3)

vdMntId 37f2795f.000c98fb (Wed Sep 29 16:41:03 1999)

state 1

vdIndex 1

jays_new_field 0

vdBlkCnt 131072

stgCluster 16

maxPgSz 16

bmtXtntPgs 128

serviceClass 1

RECORD 1 bCnt 24 version 0 BSR_DMN_ATTR (4)

bfDomainId 37f12c39.000263ea (Tue Sep 28 16:59:37 1999)

maxVds 256

bfSetDirTag -2 (fffffffe.0)

RECORD 2 bCnt 52 version 0 BSR_DMN_MATTR (15)

seqNum 1

delPendingBfSet 0 (0.0)

uid 0

gid 1

mode 0744

vdCnt 3

recoveryFailed 0

bfSetDirTag -8

ftxLogTag -9

ftxLogPgs 512

RECORD 3 bCnt 20 version 0 BSR_DMN_TRANS_ATTR (21)

chainVdIndex -1

chainMCId page,cell 0,0

op 0

dev 0x0

#

# nvbmtpg -rRv bruden_dom 2 0 6

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 32 RBMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 20 nextFreePg 0 nextfreeMCId page,cell 0,7

--------------------------------------------------------------------------

CELL 6 linkSegment 1 bfSetTag -2 (fffffffe.0) tag -12 (fffffff4.0)(RBMT)

next mcell volume page cell 2 0 27

RECORD 0 bCnt 40 version 0 BSR_VD_ATTR (3)

vdMntId 37f2795f.000c98fb (Wed Sep 29 16:41:03 1999)

state 1

vdIndex 2

jays_new_field 0

vdBlkCnt 262144

stgCluster 16

maxPgSz 16

bmtXtntPgs 128

serviceClass 1

RECORD 1 bCnt 24 version 0 BSR_DMN_ATTR (4)

bfDomainId 37f12c39.000263ea (Tue Sep 28 16:59:37 1999)

maxVds 256

bfSetDirTag -2 (fffffffe.0)

#

# nvbmtpg -rRv bruden_dom 3 0 6

==========================================================================

DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 32 RBMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 20 nextFreePg 0 nextfreeMCId page,cell 0,7

--------------------------------------------------------------------------

CELL 6 linkSegment 1 bfSetTag -2 (fffffffe.0) tag -18 (ffffffee.0)(RBMT)

next mcell volume page cell 3 0 27

RECORD 0 bCnt 40 version 0 BSR_VD_ATTR (3)

vdMntId 37f2795f.000c98fb (Wed Sep 29 16:41:03 1999)

state 1

vdIndex 3

jays_new_field 0

vdBlkCnt 1858632

stgCluster 16

maxPgSz 16

bmtXtntPgs 128

serviceClass 1

RECORD 1 bCnt 24 version 0 BSR_DMN_ATTR (4)

bfDomainId 37f12c39.000263ea (Tue Sep 28 16:59:37 1999)

maxVds 256

bfSetDirTag -2 (fffffffe.0)
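The first word of bfDomainId and vdMntId appears to be a time stamp stored as hexadecimal seconds since the UNIX epoch, which nvbmtpg prints as the date in parentheses. A minimal sketch for checking that by hand; id_seconds is a hypothetical helper name, not an AdvFS utility:

```sh
# Hypothetical helper: convert the first (hexadecimal) word of an AdvFS
# domain ID or volume mount ID into decimal seconds since the epoch.
id_seconds() {
    # Drop everything from the first dot onward, then let printf parse the hex.
    printf '%d\n' "0x${1%%.*}"
}

id_seconds 37f12c39.000263ea   # bfDomainId from the output above
id_seconds 37f2795f.000c98fb   # vdMntId from the output above
```

The first value, 938552377, is 20:59:37 UTC on September 28, 1999, which agrees with the 16:59:37 printed by nvbmtpg if the lab system ran four hours behind UTC.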

21. Examine the BMT POSIX file stat records for a few of your files. See how the information stored in this record is reflected in the output of ls -l.

# nvbmtpg -rv bruden_dom dennis_fset -t 10

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 10 nextFreePg -1 nextfreeMCId page,cell 0,18

--------------------------------------------------------------------------

CELL 10 linkSegment 0 bfSetTag 2 (2.8001) tag 10 (a.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 940

cloneId 0 cloneCnt 3 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_NIL (0)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 2 0 15

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 2 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)

bsXA[ 1] bsPage 3 vdBlk -1

RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)

st_ino 10 st_mode 100644 (S_IFREG) st_nlink 1 st_size 124456

st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0

st_mtime Wed Sep 29 11:13:22 1999 st_umtime 538485000

st_atime Wed Sep 29 11:09:47 1999 st_uatime 126376000

st_ctime Wed Sep 29 11:13:22 1999 st_uctime 538485000

fragId.frag 386 fragId.type 2 BF_FRAG_2K fragPageOffset 15

dir_tag 2 (2.8001) st_flags 0 st_unused_1 983040 st_unused_2 0

#

# pwd

/usr/bruden/advfs

#

# cd /usr/dennis

#

# ls -li sm1

10 -rw-r--r-- 1 root system 124456 Sep 29 11:13 sm1

#

#

#

# (0644 is rw-r--r--)
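The mapping from st_mode to the first column of ls -l can be reproduced mechanically. The sketch below is illustrative only: mode_to_ls is a hypothetical helper, it handles just the three file types seen in these exercises, and it assumes that no setuid, setgid, or sticky bits are set.

```sh
# Hypothetical helper: render an octal st_mode (as printed by nvbmtpg)
# in the style of the first column of ls -l.
# Assumption: setuid, setgid, and sticky bits are not set.
mode_to_ls() {
    mode=$1
    type_digits=${mode%???}              # e.g. 100 from 100644
    perm_digits=${mode#"$type_digits"}   # e.g. 644 from 100644
    case "$type_digits" in
        100) out='-' ;;   # S_IFREG
        40)  out='d' ;;   # S_IFDIR
        120) out='l' ;;   # S_IFLNK
        *)   out='?' ;;   # anything else is not handled here
    esac
    # Translate each octal permission digit into its rwx triplet.
    for digit in $(printf '%s' "$perm_digits" | sed 's/./& /g'); do
        case "$digit" in
            0) out="${out}---" ;;
            1) out="${out}--x" ;;
            2) out="${out}-w-" ;;
            3) out="${out}-wx" ;;
            4) out="${out}r--" ;;
            5) out="${out}r-x" ;;
            6) out="${out}rw-" ;;
            7) out="${out}rwx" ;;
        esac
    done
    printf '%s\n' "$out"
}

mode_to_ls 100644   # st_mode of sm1 in the record above
```

The same helper applied to 40777 and 120777 gives the strings that ls -l shows for the directory and the symbolic link in the next exercises.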

22. Now create some symbolic links and examine the corresponding BMT fast symbolic link records.

# pwd

/usr/dennis

#

# ln -s /etc/passwd pw

#

# ls -li pw

16 lrwxrwxrwx 1 root system 11 Sep 29 17:24 pw -> /etc/passwd

#

# nvbmtpg -rv bruden_dom dennis_fset -t 16 -c

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 2 nextFreePg -1 nextfreeMCId page,cell 0,26

--------------------------------------------------------------------------

CELL 24 linkSegment 0 bfSetTag 2 (2.8001) tag 16 (10.8001)

next mcell volume page cell 1 0 25

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 113

cloneId 0 cloneCnt 3 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_NIL (0)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 0 0 0

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 1 xCnt 1

bsXA[ 0] bsPage 0 vdBlk -1

bsXA[ 1] bsPage 0 vdBlk 0 (0x0)

RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)

st_ino 16 st_mode 120777 (S_IFLNK) st_nlink 1 st_size 11

st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0

st_mtime Wed Sep 29 17:24:25 1999 st_umtime 167392000

st_atime Wed Sep 29 17:24:31 1999 st_uatime 363681000

st_ctime Wed Sep 29 17:24:25 1999 st_uctime 167392000

fragId.frag 0 fragId.type 0 BF_FRAG_ANY fragPageOffset 0

dir_tag 2 (2.8001) st_flags 0 st_unused_1 0 st_unused_2 0

--------------------------------------------------------------------------

--------------------------------------------------------------------------

CELL 25 linkSegment 1 bfSetTag 2 (2.8001) tag 16 (10.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 15 version 0 BMTR_FS_DATA (254)

/etc/passwd

23. Use mktrashcan to create a trashcan directory and then examine the BMT record that points to the trash.

# shtrashcan /usr/dennis

’/usr/dennis/den_trash’ attached to ’/usr/dennis’

#

#

# ls -lid /usr/dennis

2 drwxrwxrwx 5 root system 8192 Sep 29 17:24 /usr/dennis

#

# nvbmtpg -rv bruden_dom dennis_fset -t 2 -c

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 2 nextFreePg -1 nextfreeMCId page,cell 0,26

--------------------------------------------------------------------------

CELL 8 linkSegment 0 bfSetTag 2 (2.8001) tag 2 (2.8001)

next mcell volume page cell 1 0 9

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 2

cloneId 0 cloneCnt 3 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_FTX_AGENT (2)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 0 0 0

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 34816 (0x8800)

bsXA[ 1] bsPage 1 vdBlk -1

RECORD 2 bCnt 64 version 0 BSR_BF_INHERIT_ATTR (16)

dataSafety BFD_NIL

reqServices 1

optServices 0

extendSize 0

clientArea 0 0 0 0

rsvd1 0

rsvd2 0

rsvd_sec1 0

rsvd_sec2 0

rsvd_sec3 0

RECORD 3 bCnt 8 version 0 BMTR_FS_TIME (251)

last sync Wed Sep 29 16:41:04 1999

RECORD 4 bCnt 12 version 0 BMTR_FS_UNDEL_DIR (252)

dir_tag 11 (b.8001)

--------------------------------------------------------------------------

--------------------------------------------------------------------------

CELL 9 linkSegment 1 bfSetTag 2 (2.8001) tag 2 (2.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 92 version 0 BMTR_FS_STAT (255)

st_ino 2 st_mode 40777 (S_IFDIR) st_nlink 5 st_size 8192

st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0

st_mtime Wed Sep 29 17:24:25 1999 st_umtime 167392000

st_atime Wed Sep 29 16:00:36 1999 st_uatime 968173000

st_ctime Wed Sep 29 17:24:25 1999 st_uctime 167392000

fragId.frag 0 fragId.type 0 BF_FRAG_ANY fragPageOffset 0

dir_tag 2 (2.8001) st_flags 0 st_unused_1 0 st_unused_2 0

24. Find the mcell ID of a file system root and examine its BMT file system time records.

# ls -lid /usr/bruce

2 drwxrwxrwx 3 root system 8192 Sep 28 17:24 /usr/bruce

#

# nvbmtpg -rv bruden_dom bruce_fset -t 2 -c

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0

--------------------------------------------------------------------------

pageId 0 megaVersion 4

freeMcellCnt 2 nextFreePg -1 nextfreeMCId page,cell 0,26

--------------------------------------------------------------------------

CELL 3 linkSegment 0 bfSetTag 1 (1.8001) tag 2 (2.8001)

next mcell volume page cell 1 0 4

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 2

cloneId 0 cloneCnt 0 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_FTX_AGENT (2)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 0 0 0

blksPerPage 16 segmentSize 0

delLink next page,cell 0,0 prev page,cell 0,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 34656 (0x8760)

bsXA[ 1] bsPage 1 vdBlk -1

RECORD 2 bCnt 64 version 0 BSR_BF_INHERIT_ATTR (16)

dataSafety BFD_NIL

reqServices 1

optServices 0

extendSize 0

clientArea 0 0 0 0

rsvd1 0

rsvd2 0

rsvd_sec1 0

rsvd_sec2 0

rsvd_sec3 0

RECORD 3 bCnt 8 version 0 BMTR_FS_TIME (251)

last sync Wed Sep 29 16:41:26 1999

--------------------------------------------------------------------------

--------------------------------------------------------------------------

CELL 4 linkSegment 1 bfSetTag 1 (1.8001) tag 2 (2.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 92 version 0 BMTR_FS_STAT (255)

st_ino 2 st_mode 40777 (S_IFDIR) st_nlink 3 st_size 8192

st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0

st_mtime Tue Sep 28 17:24:53 1999 st_umtime 191806000

st_atime Tue Sep 28 17:04:01 1999 st_uatime 0

st_ctime Tue Sep 28 17:24:53 1999 st_uctime 191806000

fragId.frag 0 fragId.type 0 BF_FRAG_ANY fragPageOffset 0

dir_tag 2 (2.8001) st_flags 0 st_unused_1 0 st_unused_2 0

25. Start by running showfsets on your AdvFS file domains so that you will know a few fileset IDs to use in the remaining exercises.

# showfsets bruden_dom

bruce_fset

Id : 37f12c39.000263ea.1.8001

Files : 6, SLim= 0, HLim= 0

Blocks (512) : 68288, SLim= 50000, HLim= 200000 grc= none

Quota Status : user=off group=off

dennis_fset

Id : 37f12c39.000263ea.2.8001

Clone is : den_clone

Files : 13, SLim= 0, HLim= 0

Blocks (512) : 92940, SLim= 0, HLim= 400000

Quota Status : user=off group=off

den_clone

Id : 37f12c39.000263ea.3.8003

Clone of : dennis_fset

Revision : 3

26. Use the nvtagpg program, located in /sbin/advfs, to list the root tag files of an AdvFS file domain.

# nvtagpg -r bruden_dom

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 96 root TAG page 0

--------------------------------------------------------------------------

currPage 0

numAllocTMaps 3 numDeadTMaps 0 nextFreePage 0 nextFreeMap 5

tMapA[1] tag 1 seqNo 1 primary mcell (vol,page,cell) 1 0 1 bruce_fset

tMapA[2] tag 2 seqNo 1 primary mcell (vol,page,cell) 1 0 6 dennis_fset

tMapA[3] tag 3 seqNo 3 primary mcell (vol,page,cell) 1 0 19 den_clone

27. Select a target file and use both showfile and ls -i to obtain its tag number. Two programs are needed because ls -i prints the tag number in decimal, while showfile prints it in hexadecimal together with the sequence number.

Divide the tag number by 1022 and write down both the quotient and remainder. The quotient determines the page number containing the appropriate tagmap entry while the remainder determines the position within the page.

If the sequence numbers of nvtagpg and showfile don’t match, you don’t have the right tagmap entry. You may need to convert from hexadecimal to decimal to verify the match.
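The division and the hexadecimal conversion described above can be sketched in the shell as follows. The helper names tagmap_location and hex_to_dec are hypothetical; the divisor 1022 is the one given in this exercise.

```sh
# Hypothetical helper: locate a tagmap entry from a decimal tag number.
# The quotient is the tag file page, the remainder the entry within that page.
tagmap_location() {
    tag=$1
    echo "page $((tag / 1022)) entry $((tag % 1022))"
}

# showfile prints the tag in hexadecimal (for example a.8001),
# while ls -i prints it in decimal.
hex_to_dec() {
    printf '%d\n' "0x$1"
}

tagmap_location "$(hex_to_dec a)"
```

For a small fileset the quotient is always 0, so the remainder is simply the tag number itself.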

# showfile -x sm1

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

a.8001 2 16 15 simple ** ** async 20% sm1

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 3 2 121328 48

3 1 2 75488 16

4 3 2 98256 48

7 4 2 75360 64

11 4 2 98480 64

extentCnt: 5

#

# ls -li sm1

10 -rw-r--r-- 1 root system 124456 Sep 29 11:13 sm1

#

# nvtagpg -r bruden_dom -T 2 -t 10

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 34688 "dennis_fset" TAG page 0

--------------------------------------------------------------------------

currPage 0

numAllocTMaps 16 numDeadTMaps 0 nextFreePage 0 nextFreeMap 18

tMapA[10] tag 10 seqNo 1 primary mcell (vol,page,cell) 2 0 10

#

#

# nvbmtpg -r bruden_dom 2 0 10

==========================================================================

DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0

--------------------------------------------------------------------------

CELL 10 next mcell volume page cell 0 0 0 bfSetTag,tag 2,10

RECORD 0 bCnt 92 BSR_ATTR

type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS

type BSXMT_APPEND chain mcell volume page cell 2 0 15

firstXtnt mcellCnt 2 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)

bsXA[ 1] bsPage 3 vdBlk -1

RECORD 2 bCnt 92 BMTR_FS_STAT

st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 124456

st_nlink 1 dir_tag 2 st_mtime Wed Sep 29 11:13:22 1999

fragId.type BF_FRAG_2K fragId.frag 386

#

# ls -li sm1

10 -rw-r--r-- 1 root system 124456 Sep 29 11:13 sm1

28. Use the showfile -x command on the .tags M-file for a fileset to determine the extent map of a fileset’s tag file.

# showfile -x /usr/dennis/.tags/M1

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

1.8001 1 16 8 simple ** ** ftx 100% M1

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 8 1 34528 128

extentCnt: 1

#

#

# showfile -x /usr/dennis/.tags/M2

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

2.8001 1 16 8 simple ** ** ftx 100% M2

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 8 1 34688 128

extentCnt: 1

#

# showfile -x /usr/dennis/.tags/M3

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

3.8003 1 16 8 simple ** ** ftx 50% M3

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 2 1 57712 32

2 6 1 80608 96

extentCnt: 2

#

# showfile -x /usr/dennis/.tags/M4

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

showfile: lstat failed for file ’/usr/dennis/.tags/M4’; No such file or directory

29. Create a directory and change into it with cd. Now look at the directory file by typing the following commands:

# vfilepg -r domain_name fileset_name directory/spec -f d

# od -A x -a -h -H . | more

Notice the entries for . and .. along with all the empty directory entries.

# ls -l | grep ^d

drwx------ 2 root system 8192 Sep 28 17:04 .tags

drwxrwxrwx 2 root system 8192 Sep 29 09:30 den_trash

drwxr-xr-x 2 root system 8192 Sep 29 11:09 testdir

#

# cd testdir

# vfilepg -r bruden_dom dennis_fset testdir -f d

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 92080 page 0

--------------------------------------------------------------------------

tag name

14 .

2 ..

15 smsub

#

#

# od -A x -a -h -H .

0000000 so nul nul nul dc4 nul soh nul . nul nul nul so nul nul nul

000e 0000 0014 0001 002e 0000 000e 0000

0000000e 00010014 0000002e 0000000e

0000010 soh nul nul nul stx nul nul nul dc4 nul stx nul . . nul nul

8001 0000 0002 0000 0014 0002 2e2e 0000

00008001 00000002 00020014 00002e2e

(…)

#

#

# pwd

/usr/dennis/testdir

30. Use the touch command to create five files, i, ii, iii, iv, and v within your new directory. Use od and vfilepg to examine the directory file.

# touch i

# touch ii

# touch iii

# touch iv

# touch v

#

#

# vfilepg -r bruden_dom dennis_fset testdir -f d

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 92080 page 0

--------------------------------------------------------------------------

tag name

14 .

2 ..

15 smsub

17 i

18 ii

19 iii

20 iv

21 v

#

# od -A x -a -h -H .

0000000 so nul nul nul dc4 nul soh nul . nul nul nul so nul nul nul

000e 0000 0014 0001 002e 0000 000e 0000

0000000e 00010014 0000002e 0000000e

0000010 soh nul nul nul stx nul nul nul dc4 nul stx nul . . nul nul

8001 0000 0002 0000 0014 0002 2e2e 0000

00008001 00000002 00020014 00002e2e

0000020 stx nul nul nul soh nul nul nul si nul nul nul can nul enq nul

0002 0000 8001 0000 000f 0000 0018 0005

00000002 00008001 0000000f 00050018

0000030 s m s u b nul nul nul si nul nul nul soh nul nul nul

6d73 7573 0062 0000 000f 0000 8001 0000

75736d73 00000062 0000000f 00008001

0000040 dc1 nul nul nul dc4 nul soh nul i nul nul nul dc1 nul nul nul

0011 0000 0014 0001 0069 0000 0011 0000

00000011 00010014 00000069 00000011

0000050 soh nul nul nul dc2 nul nul nul dc4 nul stx nul i i nul nul

8001 0000 0012 0000 0014 0002 6969 0000

00008001 00000012 00020014 00006969

0000060 dc2 nul nul nul soh nul nul nul dc3 nul nul nul dc4 nul etx nul

0012 0000 8001 0000 0013 0000 0014 0003

00000012 00008001 00000013 00030014

0000070 i i i nul dc3 nul nul nul soh nul nul nul dc4 nul nul nul

6969 0069 0013 0000 8001 0000 0014 0000

00696969 00000013 00008001 00000014

0000080 dc4 nul stx nul i v nul nul dc4 nul nul nul soh nul nul nul

0014 0002 7669 0000 0014 0000 8001 0000

00020014 00007669 00000014 00008001

0000090 nak nul nul nul dc4 nul soh nul v nul nul nul nak nul nul nul

0015 0000 0014 0001 0076 0000 0015 0000

00000015 00010014 00000076 00000015

(…)

#

#

# vfilepg -r bruden_dom dennis_fset -t 14

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 92080 page 0

--------------------------------------------------------------------------

000000 0e 00 00 00 14 00 01 00 2e 00 00 00 0e 00 00 00 ................

000010 01 80 00 00 02 00 00 00 14 00 02 00 2e 2e 00 00 ................

000020 02 00 00 00 01 80 00 00 0f 00 00 00 18 00 05 00 ................

000030 73 6d 73 75 62 00 00 00 0f 00 00 00 01 80 00 00 smsub...........

000040 11 00 00 00 14 00 01 00 69 00 00 00 11 00 00 00 ........i.......

000050 01 80 00 00 12 00 00 00 14 00 02 00 69 69 00 00 ............ii..

000060 12 00 00 00 01 80 00 00 13 00 00 00 14 00 03 00 ................

000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............

000080 14 00 02 00 69 76 00 00 14 00 00 00 01 80 00 00 ....iv..........

000090 15 00 00 00 14 00 01 00 76 00 00 00 15 00 00 00 ........v.......

(…)

#

#

#

# vfilepg -r bruden_dom dennis_fset -t 14 | grep iii

000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............

#

#

# vfilepg -r bruden_dom dennis_fset -t 14 | grep 070

000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............

000700 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

001070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

#

# vfilepg -r bruden_dom dennis_fset -t 14 | grep 000070

000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............
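Reading the dump above, each directory entry seems to consist of a 4-byte tag, a 2-byte entry size, a 2-byte name length, the name padded with NULs to a multiple of 4 bytes, and a trailing 8-byte tag and sequence number. This layout is inferred from the bytes shown here, not taken from the AdvFS header files, so treat it as an assumption. Under that assumption, the entry size can be predicted from the name length (dirent_size is a hypothetical helper):

```sh
# Hypothetical helper: predict the size of a directory entry from the
# length of its file name, using the layout inferred from the dump:
#   4-byte tag + 2-byte entry size + 2-byte name length
#   + name and terminating NUL padded to a multiple of 4 bytes
#   + 8-byte tag.sequence
dirent_size() {
    name_len=$1
    # Round (name_len + 1) up to the next multiple of 4.
    padded=$(( (name_len + 4) / 4 * 4 ))
    echo $(( 4 + 2 + 2 + padded + 8 ))
}

dirent_size 1   # "."     -> 20 (0x14 in the dump)
dirent_size 5   # "smsub" -> 24 (0x18 in the dump)
```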

31. Remove the file iii. Use od to determine what happens to the directory entry for iii.

# rm iii

#

# vfilepg -r bruden_dom dennis_fset -t 14 | grep 000070

000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............

#

32. Now remove ii. Notice how the old directory entries for ii and iii have been merged.

See solution for #31.

33. You may have noticed that the tags for ii and iii remain in the directory file, and you may be wondering whether the files could be undeleted. Remember what happens to free tagmap entries: they are placed on a free list for recycling.

34. Create, via touch, a file vii and notice where its directory entry is placed.

# touch vii

#

# vfilepg -r bruden_dom dennis_fset -t 14

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 92080 page 0

--------------------------------------------------------------------------

000000 0e 00 00 00 14 00 01 00 2e 00 00 00 0e 00 00 00 ................

000010 01 80 00 00 02 00 00 00 14 00 02 00 2e 2e 00 00 ................

000020 02 00 00 00 01 80 00 00 0f 00 00 00 18 00 05 00 ................

000030 73 6d 73 75 62 00 00 00 0f 00 00 00 01 80 00 00 smsub...........

000040 11 00 00 00 14 00 01 00 69 00 00 00 11 00 00 00 ........i.......

000050 01 80 00 00 12 00 00 00 14 00 03 00 76 69 69 00 ............vii.

000060 12 00 00 00 02 80 00 00 00 00 00 00 14 00 00 00 ................

000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............

000080 14 00 02 00 69 76 00 00 14 00 00 00 01 80 00 00 ....iv..........

000090 15 00 00 00 14 00 01 00 76 00 00 00 15 00 00 00 ........v.......

0000a0 01 80 00 00 00 00 00 00 5c 01 00 00 00 00 00 00 ........\.......

(…)

#

#

# vfilepg -r bruden_dom dennis_fset testdir -f d

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 92080 page 0

--------------------------------------------------------------------------

tag name

14 .

2 ..

15 smsub

17 i

18 vii

20 iv

21 v

35. Create several files with very, very long names and see how the creation of directory entries avoids crossing the sector boundaries.

#

# touch Not_so_long_but_still_pretty_longggggggggggggggggggggggggggggggggg

nggggggggggggggggggggggggggggggggggggg

gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg

ggggggggggggggggggggggggggggggggggggggg

#

# ls -li

total 92

19 -rw-r--r-- 1 root system 0 Sep 29 18:30

Not_so_long_but_still_pretty_longgggggggggggggggggggggggggggggggggggggggggggggggggggggg

gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg

17 -rw-r--r-- 1 root system 0 Sep 29 17:57 i

20 -rw-r--r-- 1 root system 0 Sep 29 17:57 iv

15 -rw-r--r-- 1 root system 93342 Sep 29 11:09 smsub

21 -rw-r--r-- 1 root system 0 Sep 29 17:57 v

18 -rw-r--r-- 1 root system 0 Sep 29 18:25 vii

# ls -lid

14 drwxr-xr-x 2 root system 8192 Sep 29 18:30 .

#

#

#

# ls -li

total 92

19 -rw-r--r-- 1 root system 0 Sep 29 18:31

Not_so_long_but_still_pretty_longgggggggggggggggggggggggggggggggggggggggggggggggggggggg

gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg

22 -rw-r--r-- 1 root system 0 Sep 29 18:31

Not_so_long_but_still_pretty_longgggggggggggggggggggggggggggggggggggggggggggggggggggggg

gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg1

23 -rw-r--r-- 1 root system 0 Sep 29 18:31

Not_so_long_but_still_pretty_longgggggggggggggggggggggggggggggggggggggggggggggggggggggg

gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg2

24 -rw-r--r-- 1 root system 0 Sep 29 18:31

Not_so_long_but_still_pretty_longgggggggggggggggggggggggggggggggggggggggggggggggggggggg

gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg3

17 -rw-r--r-- 1 root system 0 Sep 29 17:57 i

20 -rw-r--r-- 1 root system 0 Sep 29 17:57 iv

15 -rw-r--r-- 1 root system 93342 Sep 29 11:09 smsub

21 -rw-r--r-- 1 root system 0 Sep 29 17:57 v

18 -rw-r--r-- 1 root system 0 Sep 29 18:25 vii

#

# ls -lid

14 drwxr-xr-x 2 root system 8192 Sep 29 18:31 .

36. Perform a showfile -i command on an 8K directory file. Make a larger directory file (use the script shown here if you like). What does showfile -i indicate on the larger directory?

# showfile -i /usr/dennis/testdir

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

/usr/dennis/testdir: does not have an index file

#

#

# cat make_100

#! /usr/bin/ksh

integer x=0

integer end=0

read x?"Start number? "

end=$x+100

while (( $x < $end ))

do

touch file$x

x=$x+1

done

#

# ls -lid

14 drwxr-xr-x 2 root system 8192 Sep 29 18:35 .

#

#

# ./make_100

Start number? 0

#

# ls -lid

14 drwxr-xr-x 2 root system 8192 Sep 29 18:46 .

#

# ./make_100

Start number? 100

#

# ls -lid

14 drwxr-xr-x 2 root system 8192 Sep 29 18:46 .

#

# ./make_100

Start number? 200

#

# ls -lid

14 drwxr-xr-x 2 root system 16384 Sep 29 18:46 .

#

#

# showfile -i .

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

146.8001 1 16 2 simple ** ** ftx 100% index (.)

#

#

#

# pwd

/usr/dennis/testdir

37. To convince yourself that something unusual really is happening, execute the following commands:

$ dd if=/etc/disktab of=frag.file

$ ls -l frag.file

$ showfile -x frag.file

# dd if=/etc/disktab of=frag.file

60+1 records in

60+1 records out

#

#

# ls -li frag.file

327 -rw-r--r-- 1 root system 31114 Sep 29 18:48 frag.file

#

#

# showfile -x frag.file

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

147.8001 2 16 3 simple ** ** async 50% frag.file

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 2 2 75424 32

2 1 2 98464 16

extentCnt: 2

#

38. Execute the ls -l command on some fragment bitfiles. Remember that .tags/1 gets you to a fileset’s fragment bitfile.

# ls -li /usr/dennis/.tags/1

1 ---------- 0 root system 786432 Dec 31 1969 /usr/dennis/.tags/1

#

#

# ls -li /usr/bruce/.tags/1

1 ---------- 0 root system 0 Dec 31 1969 /usr/bruce/.tags/1

#

39. Copy some randomly sized file, such as /etc/disktab, onto your AdvFS file system. Now use nvbmtpg to find out where your file’s fragment resides within the fragment bitfile.

# cp /etc/disktab /usr/dennis

#

# ls -li di*

328 -rwxr-xr-x 1 root system 31114 Sep 29 18:52 disktab

#

#

# nvbmtpg -rv bruden_dom dennis_fset -t 328

==========================================================================

DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 176 BMT page 4

--------------------------------------------------------------------------

pageId 4 megaVersion 4

freeMcellCnt 27 nextFreePg 5 nextfreeMCId page,cell 4,1

--------------------------------------------------------------------------

CELL 0 linkSegment 0 bfSetTag 2 (2.8001) tag 328 (148.8001)

next mcell volume page cell 0 0 0

RECORD 0 bCnt 92 version 0 BSR_ATTR (2)

type BSRA_VALID (3)

bfPgSz 16 transitionId 852

cloneId 0 cloneCnt 3 maxClonePgs 0

deleteWithClone 0 outOfSyncClone 0

cl.dataSafety BFD_NIL (0)

cl reqServices 1 optServices 0 extendSize 0 rsvd1 0

rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0

RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)

type BSXMT_APPEND (0)

chain mcell volume page cell 0 0 0

blksPerPage 16 segmentSize -1484947200

delLink next page,cell 56286197,22 prev page,cell 20828416,0

delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0

firstXtnt mcellCnt 1 xCnt 2

bsXA[ 0] bsPage 0 vdBlk 1105264 (0x10dd70)

bsXA[ 1] bsPage 3 vdBlk -1

RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)

st_ino 328 st_mode 100755 (S_IFREG) st_nlink 1 st_size 31114

st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0

st_mtime Wed Sep 29 18:52:20 1999 st_umtime 360751000

st_atime Wed Sep 29 18:52:20 1999 st_uatime 355868000

st_ctime Wed Sep 29 18:52:20 1999 st_uctime 360751000

fragId.frag 8 fragId.type 7 BF_FRAG_7K fragPageOffset 3

dir_tag 2 (2.8001) st_flags 0 st_unused_1 196608 st_unused_2 0
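The fragment type can be cross-checked against the file size. In the record above, fragPageOffset is 3 and each page is 8192 bytes (bfPgSz 16 blocks of 512 bytes), so whatever remains of st_size beyond those pages has to fit in the fragment. A small sketch of that arithmetic, with frag_tail as a hypothetical helper name:

```sh
# Hypothetical helper: compute how many bytes of a file fall beyond its
# full 8192-byte pages, and which fragment size (in 1K steps) holds them.
frag_tail() {
    st_size=$1
    frag_page_offset=$2
    tail_bytes=$(( st_size - frag_page_offset * 8192 ))
    # Round the tail up to whole kilobytes.
    frag_kb=$(( (tail_bytes + 1023) / 1024 ))
    echo "$tail_bytes bytes -> BF_FRAG_${frag_kb}K"
}

frag_tail 31114 3     # the disktab copy above
frag_tail 124456 15   # sm1 from exercise 21
```

Both results agree with the fragId.type values that nvbmtpg reports for these files (BF_FRAG_7K and BF_FRAG_2K).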

40. Now use dd to copy that fragment directly out of the fragment bitfile. You will use a command similar to:

# dd if=/playpen/.tags/1 of=/tmp/copy ibs=1024 iseek=76275 count=3

# dd if=/usr/dennis/.tags/1 of=/tmp/copy ibs=1024 iseek=8 count=1

1+0 records in

2+0 records out

#

# cat /tmp/copy

:

ra71|RA71|DEC RA71 Winchester:\

:ty=winchester:dt=MSCP:ns#51:nt#14:nc#1915:\

:oa#0:pa#131072:ba#8192:fa#1024:\

:ob#131072:pb#262144:bb#8192:fb#1024:\

:oc#0:pc#1367310:bc#8192:fc#1024:\

:od#393216:pd#324698:bd#8192:fd#1024:\

:oe#717914:pe#324698:be#8192:fe#1024:\

:of#1042612:pf#324698:bf#8192:ff#1024:\

:og#393216:pg#819200:bg#8192:fg#1024:\

:oh#1212416:ph#154894:bh#8192:fh#1024:

(…)

#
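In the solution above, the iseek value is the fragId.frag number from exercise 39 (8), because the input block size is 1024 bytes, and count=1 copies only the first 1K of the 7K fragment. The sketch below prints a dd command for a whole fragment. It assumes that fragment numbers always count 1K units from the start of the fragment bitfile, which is what this one example suggests; frag_dd is a hypothetical helper that only echoes the command and does not run it.

```sh
# Hypothetical helper: print (not run) a dd command that would copy one
# whole fragment out of a fileset's fragment bitfile.
# Assumption: fragId.frag is an offset in 1K units into .tags/1.
frag_dd() {
    mount_point=$1   # where the fileset is mounted
    frag=$2          # fragId.frag from the BMTR_FS_STAT record
    frag_kb=$3       # fragment size in K, from fragId.type
    echo "dd if=$mount_point/.tags/1 of=/tmp/copy ibs=1024 iseek=$frag count=$frag_kb"
}

frag_dd /usr/dennis 8 7
```

Copying the whole fragment includes any unused bytes after the end of the file data, so the tail of /tmp/copy may contain stale content.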

41. The program nvfragpg, found in /sbin/advfs, prints various interesting statistics about fragment usage within the eight different fragment groups. Read the reference page for this command and then apply it to each of your AdvFS filesets.

# nvfragpg -rv bruden_dom dennis_fset

==========================================================================

DOMAIN "bruden_dom"

--------------------------------------------------------------------------

frag type free 1K 2K 3K 4K 5K 6K 7K totals

groups 1 1 1 0 1 1 0 1 6

frags - 127 63 0 31 25 0 18 264

frags used - 1 1 0 1 0 0 2 5

disk space 128K 128K 128K 0K 128K 128K 0K 128K 768K

space used - 1K 2K 0K 4K 0K 0K 14K 21K

space free 127K 126K 124K 0K 120K 125K 0K 112K 734K

overhead 1K 1K 1K 0K 1K 1K 0K 1K 6K

wasted - 0K 1K 0K 3K 2K 0K 1K 7K

% used - <1% 1% 0% 3% <1% 0% 9% 2%

PAGE 0 lbn 121392 BF_FRAG_7K version 1 freeFrags 16 nextFreeGrp -1

PAGE 16 lbn 121648 BF_FRAG_5K version 1 freeFrags 25 nextFreeGrp -1

PAGE 32 lbn 121904 BF_FRAG_4K version 1 freeFrags 30 nextFreeGrp -1

PAGE 48 lbn 98560 BF_FRAG_2K version 1 freeFrags 62 nextFreeGrp -1

PAGE 64 lbn 98816 BF_FRAG_1K version 1 freeFrags 126 nextFreeGrp -1

PAGE 80 lbn 99072 BF_FRAG_ANY version 1 freeFrags 0 nextFreeGrp -1

#

#

# nvfragpg -rvf bruden_dom dennis_fset

==========================================================================

DOMAIN "bruden_dom"

--------------------------------------------------------------------------

frag type free 1K 2K 3K 4K 5K 6K 7K totals

groups 1 1 1 0 1 1 0 1 6

frags - 127 63 0 31 25 0 18 264

frags used - 1 1 0 1 0 0 2 5

disk space 128K 128K 128K 0K 128K 128K 0K 128K 768K

space used - 1K 2K 0K 4K 0K 0K 14K 21K

space free 127K 126K 124K 0K 120K 125K 0K 112K 734K

overhead 1K 1K 1K 0K 1K 1K 0K 1K 6K

wasted - 0K 1K 0K 3K 2K 0K 1K 7K

% used - <1% 1% 0% 3% <1% 0% 9% 2%

head of free lists of frag groups from fileset attributes:

frag type BF_FRAG_ANY firstFreeGrp 80 lastFreeGrp 32

frag type BF_FRAG_1K firstFreeGrp 64 lastFreeGrp 64

frag type BF_FRAG_2K firstFreeGrp 48 lastFreeGrp 48

frag type BF_FRAG_3K firstFreeGrp -1 lastFreeGrp -1

frag type BF_FRAG_4K firstFreeGrp 32 lastFreeGrp 32

frag type BF_FRAG_5K firstFreeGrp 16 lastFreeGrp 16

frag type BF_FRAG_6K firstFreeGrp -1 lastFreeGrp -1

frag type BF_FRAG_7K firstFreeGrp 0 lastFreeGrp 0

BF_FRAG_ANY groups on the free list

PAGE 80 lbn 99072 BF_FRAG_ANY version 1 freeFrags 0 nextFreeGrp -1

BF_FRAG_1K groups on the free list

PAGE 64 lbn 98816 BF_FRAG_1K version 1 freeFrags 126 nextFreeGrp -1

BF_FRAG_2K groups on the free list

PAGE 48 lbn 98560 BF_FRAG_2K version 1 freeFrags 62 nextFreeGrp -1

BF_FRAG_4K groups on the free list

PAGE 32 lbn 121904 BF_FRAG_4K version 1 freeFrags 30 nextFreeGrp -1

BF_FRAG_5K groups on the free list

PAGE 16 lbn 121648 BF_FRAG_5K version 1 freeFrags 25 nextFreeGrp -1

BF_FRAG_7K groups on the free list

PAGE 0 lbn 121392 BF_FRAG_7K version 1 freeFrags 16 nextFreeGrp -1

42. Use the nvtagpg command to find the mcell IDs of some AdvFS bitfile-sets. Now use nvfragpg to print out the addresses of the fragment group headers for these filesets.

# nvtagpg -rv bruden_dom dennis_fset

==========================================================================

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 34688 "dennis_fset" TAG page 0

--------------------------------------------------------------------------

currPage 0

numAllocTMaps 328 numDeadTMaps 0 nextFreePage 0 nextFreeMap 330 padding 0

tMapA[0] tag 0 seqNo 1 (NOT USED) primary (vol,page,cell) 0 0 1

tMapA[1] tag 1 seqNo 1 primary mcell (vol,page,cell) 2 0 4

tMapA[2] tag 2 seqNo 1 primary mcell (vol,page,cell) 1 0 8

tMapA[3] tag 3 seqNo 1 primary mcell (vol,page,cell) 2 0 5

tMapA[4] tag 4 seqNo 1 primary mcell (vol,page,cell) 1 0 10

tMapA[5] tag 5 seqNo 1 primary mcell (vol,page,cell) 2 0 6

tMapA[6] tag 6 seqNo 1 primary mcell (vol,page,cell) 2 0 7

tMapA[7] tag 7 seqNo 1 primary mcell (vol,page,cell) 1 0 12

tMapA[8] tag 8 seqNo 1 primary mcell (vol,page,cell) 2 0 8

tMapA[9] tag 9 seqNo 2 primary mcell (vol,page,cell) 1 0 17

tMapA[10] tag 10 seqNo 1 primary mcell (vol,page,cell) 2 0 10

tMapA[11] tag 11 seqNo 1 primary mcell (vol,page,cell) 3 0 6

tMapA[12] tag 12 seqNo 1 primary mcell (vol,page,cell) 2 0 12

tMapA[13] tag 13 seqNo 1 primary mcell (vol,page,cell) 1 0 15

tMapA[14] tag 14 seqNo 1 primary mcell (vol,page,cell) 1 0 22

tMapA[15] tag 15 seqNo 1 primary mcell (vol,page,cell) 2 0 13

tMapA[16] tag 16 seqNo 1 primary mcell (vol,page,cell) 1 0 24

(…)

tMapA[328] tag 328 seqNo 1 primary mcell (vol,page,cell) 3 4 0

tMapA[329] tag 329 seqNo 1 (NOT USED) primary (vol,page,cell) 0 10 11

tMapA[330] tag 330 seqNo 1 (NOT USED) primary (vol,page,cell) 0 10 12

tMapA[331] tag 331 seqNo 1 (NOT USED) primary (vol,page,cell) 0 10 13

(…)

tMapA[1021] tag 1021 seqNo 1 (NOT USED) primary (vol,page,cell) 0 0 0

#

#

# nvfragpg -rv bruden_dom dennis_fset

==========================================================================

DOMAIN "bruden_dom"

--------------------------------------------------------------------------

frag type free 1K 2K 3K 4K 5K 6K 7K totals

groups 1 1 1 0 1 1 0 1 6

frags - 127 63 0 31 25 0 18 264

frags used - 1 1 0 1 0 0 2 5

disk space 128K 128K 128K 0K 128K 128K 0K 128K 768K

space used - 1K 2K 0K 4K 0K 0K 14K 21K

space free 127K 126K 124K 0K 120K 125K 0K 112K 734K

overhead 1K 1K 1K 0K 1K 1K 0K 1K 6K

wasted - 0K 1K 0K 3K 2K 0K 1K 7K

% used - <1% 1% 0% 3% <1% 0% 9% 2%

PAGE 0 lbn 121392 BF_FRAG_7K version 1 freeFrags 16 nextFreeGrp -1

PAGE 16 lbn 121648 BF_FRAG_5K version 1 freeFrags 25 nextFreeGrp -1

PAGE 32 lbn 121904 BF_FRAG_4K version 1 freeFrags 30 nextFreeGrp -1

PAGE 48 lbn 98560 BF_FRAG_2K version 1 freeFrags 62 nextFreeGrp -1

PAGE 64 lbn 98816 BF_FRAG_1K version 1 freeFrags 126 nextFreeGrp -1

PAGE 80 lbn 99072 BF_FRAG_ANY version 1 freeFrags 0 nextFreeGrp -1

43. The nvfragpg program, also located in /sbin/advfs, will print out a list of the free fragments found within a fragment group along with the address of the next group of that type.

# nvfragpg -rvf bruden_dom dennis_fset

==========================================================================

DOMAIN "bruden_dom"

--------------------------------------------------------------------------

frag type free 1K 2K 3K 4K 5K 6K 7K totals

groups 1 1 1 0 1 1 0 1 6

frags - 127 63 0 31 25 0 18 264

frags used - 1 1 0 1 0 0 2 5

disk space 128K 128K 128K 0K 128K 128K 0K 128K 768K

space used - 1K 2K 0K 4K 0K 0K 14K 21K

space free 127K 126K 124K 0K 120K 125K 0K 112K 734K

overhead 1K 1K 1K 0K 1K 1K 0K 1K 6K

wasted - 0K 1K 0K 3K 2K 0K 1K 7K

% used - <1% 1% 0% 3% <1% 0% 9% 2%

head of free lists of frag groups from fileset attributes:

frag type BF_FRAG_ANY firstFreeGrp 80 lastFreeGrp 32

frag type BF_FRAG_1K firstFreeGrp 64 lastFreeGrp 64

frag type BF_FRAG_2K firstFreeGrp 48 lastFreeGrp 48

frag type BF_FRAG_3K firstFreeGrp -1 lastFreeGrp -1

frag type BF_FRAG_4K firstFreeGrp 32 lastFreeGrp 32

frag type BF_FRAG_5K firstFreeGrp 16 lastFreeGrp 16

frag type BF_FRAG_6K firstFreeGrp -1 lastFreeGrp -1

frag type BF_FRAG_7K firstFreeGrp 0 lastFreeGrp 0

BF_FRAG_ANY groups on the free list

PAGE 80 lbn 99072 BF_FRAG_ANY version 1 freeFrags 0 nextFreeGrp -1

BF_FRAG_1K groups on the free list

PAGE 64 lbn 98816 BF_FRAG_1K version 1 freeFrags 126 nextFreeGrp -1

BF_FRAG_2K groups on the free list

PAGE 48 lbn 98560 BF_FRAG_2K version 1 freeFrags 62 nextFreeGrp -1

BF_FRAG_4K groups on the free list

PAGE 32 lbn 121904 BF_FRAG_4K version 1 freeFrags 30 nextFreeGrp -1

BF_FRAG_5K groups on the free list

PAGE 16 lbn 121648 BF_FRAG_5K version 1 freeFrags 25 nextFreeGrp -1

BF_FRAG_7K groups on the free list

PAGE 0 lbn 121392 BF_FRAG_7K version 1 freeFrags 16 nextFreeGrp -1

44. Start out by running od -x on one of your storage bitmap files. The command syntax will be something like:

# od -x -N 1024 /usr/.tags/-7

# od -x -N 1024 /usr/dennis/.tags/M-7

0000000 0000 0000 c700 ffff ffff ffff ffff ffff

0000020 ffff ffff ffff ffff ffff ffff ffff ffff

*

0000120 ffff ffff ffff ffff 00ff 0000 0000 0000

0000140 0000 0000 0000 0000 0000 0000 0000 0000

*

0000420 0000 0000 c000 ffff ffff ffff ffff ffff

0000440 ffff ffff ffff ffff ffff ffff ffff ffff

*

0001340 07ff 0000 0000 0000 0000 0000 0000 0000

0001360 0000 0000 0000 0000 0000 0000 0000 0000

*

0020000

#

45. Repeat the previous exercise, but this time use the virtual disk interface. You must use showfile -x to find the extent map for the storage bitmap. It will look similar to:

# od -x -j 112b -N 1024 /dev/disk/dsk3c

# od -x -j 112b -N 1024 /dev/rdisk/dsk0a

0000000 0000 0000 c700 ffff ffff ffff ffff ffff

0000016 ffff ffff ffff ffff ffff ffff ffff ffff

*

0000080 ffff ffff ffff ffff 00ff 0000 0000 0000

0000096 0000 0000 0000 0000 0000 0000 0000 0000

*

0000272 0000 0000 c000 ffff ffff ffff ffff ffff

0000288 ffff ffff ffff ffff ffff ffff ffff ffff

*

0000736 07ff 0000 0000 0000 0000 0000 0000 0000

0000752 0000 0000 0000 0000 0000 0000 0000 0000

*

0008192 0002 0000 0003 0000 2c39 37f1 63ea 0002

0008208 0001 0000 0000 ffff 0000 0000 0002 0000

(...)

46. Here’s how to determine if page 17000 of an AdvFS virtual disk is free:

# expr 17000 / 8184 \* 8192 + 17000 % 8184 + 8

17024

# od -x -j 17024 -N 1 /usr/.tags/-7

47. Tru64 UNIX V5 supplies a much more convenient command, vsbmpg. Look at the reference page for this command. Accomplish the same result as Exercise 46 without all the arithmetic. Find out if page 50000 of one of your virtual disks is free.

# vsbmpg -r bruden_dom 1 -B 50000

DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 112 SBM page 0

--------------------------------------------------------------------------

blocks 50000 - 50015 (0xc350 - 0xc35f) are mapped by SBM map entry 97, bit 21

mapInt[97] 11111111 11111111 11111111 11111111

block 50000 ^

48. Use showfile to see the extents of your miscellaneous bitfile. Find them at .tags/-11, and so forth.

# showfile -x /usr/dennis/.tags/M-11

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

fffffff5.0000 1 16 4 simple ** ** ftx 50% M-11

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 2 1 0 32

2 2 1 64 32

extentCnt: 2

#

#

# showfile -x /usr/dennis/.tags/M-17

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

ffffffef.0000 2 16 4 simple ** ** ftx 50% M-17

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 2 2 0 32

2 2 2 64 32

extentCnt: 2

#

#

# showfile -x /usr/dennis/.tags/M-23

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

ffffffe9.0000 3 16 4 simple ** ** ftx 50% M-23

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

0 2 3 0 32

2 2 3 64 32

extentCnt: 2

3

AdvFS In-Memory Structures

About This Chapter

Introduction

This chapter presents information about the AdvFS in-memory structures. These structures can be examined in a live system or in a crash dump. The structures can be conceptually layered just as the on-disk structures are:

• Virtual file system (VFS)

• File access subsystem (FAS)

• Bitfile access subsystem (BAS)

An understanding of these structures will provide essential information for detailed troubleshooting.

Objectives

To describe AdvFS in-memory structures, you should be able to:

• Describe VFS structures.

• Describe FAS layer in-memory structures.

• Describe BAS layer in-memory structures.

• List other in-memory structures:

— Free space cache

— Bitfile buffer descriptor

— I/O descriptor

Resources

For more information on topics in this chapter as well as related topics, see the following:

• Advanced File System Administration

• AdvFS Reference Pages

• Header Files

Examining AdvFS In-Memory Structures

Overview

As discussed in the AdvFS On-Disk Structures chapter, the AdvFS file system software is arranged in a hierarchy of layers. This section looks at some of the in-memory data structures that represent the on-disk data. When reading a crash dump, this may be the only information available.

• Overview of in-memory structures

• Big picture

Overview of In-Memory Structures

Recall that all I/O will flow through the VFS software which directs the flow to the appropriate file system specific software. The following file system layers are supported by in-memory structures:

VFS layer    Vnode and mount structures
FAS layer    POSIX file and fileset structures
BAS layer    Bitfile and bitfile-set structures

Big Picture of Data Structure Linkage

The following figure will serve as a general reference for the linkage of the data structures to be studied.

Figure 3-1: Big Picture

[Figure: linkage of the in-memory structures. Labels in the figure: (dbx) func thread_block; super_task, thread, task, proc, utask; file table (systemwide); vnode, bfNode, bfAccess, fsContext, extents; rootfs, mount, fileSetNode, bfSet, domain, vd.]

Checking the VFS Layer

Overview

The Virtual File System (VFS) acts as a director of subsequent file system activities. This software checks which type of file system the I/O is destined for and directs the logic flow accordingly.

• VFS-specific structures

• vnode structure

• mount structure

VFS Specific Structures

The VFS layer consists primarily of the following structures and fields:

File descriptor:

• Returned from the open(2) system call

• Used in the per-process utask structure

• Points (indirectly) to file structure

The following example shows an open(2) system call returning a file descriptor to the caller.

Example 3-1: Open System Call Returning a File Descriptor

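#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int fd;

    /* Open the file for reading and writing, creating it if necessary. */
    fd = open("/usr/bruden/ob_1", O_RDWR | O_CREAT, 0777);
    if (fd == -1)
    {
        perror("open failed ");
        exit(EXIT_FAILURE);
    }

    printf("file opened -- file descriptor is %d.\n", fd);
    return EXIT_SUCCESS;
}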

File structure:

• Contains file credentials

• Contains file offset

• Points to vnode

The vnode has a file system-specific extension leading to FAS in-memory information.

Find the file structure by using the file descriptor as an index into an array of pointers in the utask structure. This linkage has changed significantly in V5 due to support for up to 64K concurrently opened files per process. The following example shows the fields of the file structure and points out the f_data field, which will point to the vnode for the open file.

Example 3-2: Fields of the File Structure

(dbx) whatis struct file
struct file {
    simple_lock_data_t f_incore_lock;
    int f_flag;
    uint_t f_count;
    int f_type;
    int f_msgcount;
    struct ucred * f_cred;
    struct fileops * f_ops;
    caddr_t f_data;     <== This field will most likely point to a vnode
    union {
        off_t fu_offset;
        struct file * fu_freef;
    } f_u;
    uint_t f_io_lock;
    int f_io_waiters;
};

The following example shows a path to the utask structure which contains open file information.

Example 3-3: Using utask to Get Open File Information

(dbx) set $pid=953
(dbx) p (*(struct super_task *)thread.task).utask
struct {
    uu_comm = "openone"
    uu_maxuprc = 64
    uu_logname = 0xfffffc0002fcc4a0 = "root"
    (...)
    uu_file_state = struct {
        (...)
        uf_entry = {
            [0] 0xfffffc00017bd700    <== Indirectly points to file structure
            [1] (nil)
            [2] (nil)
            [3] (nil)
            [4] (nil)
            [5] (nil)
            [6] (nil)
            [7] (nil)
        }

Here is a route to the vnode using the file descriptor (number 3) to get to the file structure and from there to the vnode. This can be crafted into a dbx macro which expects to be given the PID of the process and the file descriptor number.

Example 3-4: Getting to the vnode

(dbx) p *(struct vnode *)(*(struct super_task *)thread.task).utask.uu_file_state.uf_entry[0][3].ufe_ofile.f_data
struct {
    v_lock = 0
    (...)
    v_type = VREG                     <== Regular file
    v_tag = VT_MSFS                   <== AdvFS
    v_mount = 0xfffffc0005ab2a80      <== To mount structure
    v_mountedhere = (nil)
    v_op = 0xfffffc00006b01d0
    v_freef = (nil)
    v_freeb = (nil)
    v_mountf = 0xfffffc000367bb00
    v_mountb = 0xfffffc0001c3a958
    (...)
    v_data = "ˆ"                      <== Begins file system specific information

vnode Structure

Every open file is represented by a vnode. If the file is within an AdvFS file system, the vnode will have a file system-specific extension called a bfNode structure.

Characteristics of the vnode structure include:

• One per open file

• Points to mount structure

• Points to VM object

• Points to vnode switch table

• Ends with a file system specific private area extension

mount Structure

Each mounted file system is represented by a mount structure. All mount structures are located through a linked list whose head is the global symbol rootfs.

Characteristics of the mount structure include:

• One per active file system

• Points to root vnode

• Starts linked list of file system vnodes

• Points to VFS switch table

• Points to file system-specific private area

The following example uses rootfs to locate a mount structure. Note that we are examining the third mount structure. The structures are linked through the m_nxt field.

Example 3-5: Viewing a mount Structure Using rootfs

(dbx) p *rootfs.m_nxt.m_nxt
struct {
    m_lock = 18446739675758144512
    m_flag = 20480
    m_funnel = 0
    m_nxt = 0xfffffc0005ab2d80        <== To next mount structure
    m_prev = 0xfffffc0005ab3380
    m_op = 0xfffffc00006af990
    m_vnodecovered = 0xfffffc0001a8f200
    m_mounth = 0xfffffc0004d5a240
    m_vlist_lock = 0
    m_exroot = 0
    m_uid = 0
    m_stat = struct {
        f_type = 10
        f_flags = 20480
        f_fsize = 512
        f_bsize = 8192
        f_blocks = 1426112
        f_bfree = 400688
        f_bavail = 324960
        f_files = 582215
        f_ffree = 558528
        (...)
        f_mntonname = 0xfffffc0000d34c20 = "/usr"
        f_mntfromname = 0xfffffc0000d34940 = "usr_domain#usr"
        (...)
        msfs_args = struct {
            id = struct {
                id1 = 937059922
                id2 = 653520
                tag = 1
            }
        }

(...)

Explaining the FAS Layer

Overview

The FAS layer is the upper of the two AdvFS layers (FAS and BAS). It is represented by structures for each open file and each mounted fileset. These hold AdvFS-specific information, while the VFS structures hold information common to all file systems.

• FAS layer structures

• bfNode structure

• fsContext structure

• In-Memory Per File Structures

• fileSetNode structure

FAS Layer Structures

The FAS layer has structures to represent the open file within the context of user files and directories and filesets. It enables access to the lower-level, bitfile-based structures. FAS layer structures include:

bfNode structure:

• AdvFS vnode

• Points to BAS layer bfAccess structure

Fileset context:

• Points to parent fileset

• fsContext structure

Fileset node:

• AdvFS private mount information

• fileSetNode structure

In-Memory Per File Structures

The file access subsystem provides the interface to the storage system, the bitfile access subsystem. It provides per-file statistics and directories. It also provides access to symbolic links stored in bitfile metadata table entries.

bfNode Structure

The first private area (FAS level) of an AdvFS file is the bfNode structure. This structure has changed in V5. Most notably, the first field is now a pointer rather than an AdvFS handle.

The following is an excerpt from ms_osf.h.

Example 3-6: Fields of the bfNode Structure

/*
 * bfNode is the msfs structure at the end of a vnode
 */

typedef struct bfNode {
    struct bfAccess *accessp;
    struct fsContext *fsContextp;
    bfTagT tag;
    bfSetIdT bfSetId;
} bfNodeT;

Source location: msfs/ms_osf.h

The following example uses an alias to get at the bfNode structure for an open file.

Example 3-7: Accessing bfNode Structure Using an Alias

(dbx) alias v5_get_ofile_bfNode_struct(pidd,fd) "set $pid=pidd; p *(struct bfNode *)&((struct vnode *)(*(struct super_task *)thread.task).utask.uu_file_state.uf_entry[0][fd].ufe_ofile.f_data).v_data"

(dbx) v5_get_ofile_bfNode_struct(953,3)
struct {
    accessp = 0xfffffc0004d94d88
    fsContextp = 0xfffffc0004d5ae70
    tag = struct {
        num = 23704
        seq = 32770
    }
    bfSetId = struct {
        domainId = struct {
            tv_sec = 937059922
            tv_usec = 653520
        }
        dirTag = struct {
            num = 1
            seq = 32769
        }
    }
}
(dbx) set $bfaccess=0xfffffc0004d94d88
(dbx) set $fscontext=0xfffffc0004d5ae70

fsContext Structure

Each file will have basic ownership and permission information available in a memory-based structure. The fsContext structure also provides references to the file's parent directory and fileset.

Characteristics of the fsContext structure include:

• Located through the bfNode structure

• UNIX (POSIX) information about a file (rather than the bitfile)

• Contains:

— Quota information

— Tag of fileset

— Tag of file’s parent directory

— File statistics

The following example contains excerpts from fs_dir.h. It shows the fields of the fsContext data structure. Note the pointer to the fileSetNode structure.

Example 3-8: Fields of the fsContext Structure

struct fsContext {
    short initialized;        /* zero if fsContext is not initialized */
    short quotaInitialized;   /* zero if quota stuff is not initialized */
    bfTagT undel_dir_tag;     /* tag of undelete directory */
    long fs_flag;             /* flag word - see below */
    int dirty_stats;          /* flag for directories, says update the
                                 stats in the parent directory entry */
    int dirty_alloc;          /* set if stats from an allocating write
                                 are not on disk (ICHGMETA) */

    lock_data_t file_lock;    /* Use an OSF complex lock (read_write_lock) */

    long dirstamp;            /* stamp to determine directory changes */
    mutexT fsContext_mutex;   /* mutex to take out locks on this structure */
#ifdef ADVFS_DEBUG
    char file_name[30];       /* first 29 chars of file name */
#endif
    bfTagT bf_tag;            /* the tag for the file */
    long last_offset;         /* the offset of the last found entry */
    struct fs_stat dir_stats; /* stats */
    struct fileSetNode *fileSetNode;   /* pointer to per-fileset info */
    struct dQuot *diskQuot[MAXQUOTAS]; /* pointers to quota structs */
};

Source location: msfs/fs_dir.h.

The following displays the fsContext structure.

Example 3-9: Displaying the fsContext Structure

(dbx) p *(struct fsContext *)$fscontext
struct {
    initialized = 1
    quotaInitialized = 1
    undel_dir_tag = struct {
        num = 0
        seq = 0
    }
    fs_flag = 0
    (...)
    bf_tag = struct {
        num = 23704
        seq = 32770
    }
    last_offset = 0
    dir_stats = struct {
        st_ino = struct {
            num = 23704
            seq = 32770
        }
        st_mode = 33261
        st_uid = 0
        st_gid = 0
        st_rdev = 0
        st_size = 31114
        (...)
        fragId = struct {
            frag = 58113
            type = BF_FRAG_7K
        }
        st_nlink = 1
        st_unused_1 = 0
        fragPageOffset = 3
        st_unused_2 = 0
    }
    fileSetNode = 0xfffffc0005ab5088
    diskQuot = {
        [0] 0xfffffc0005ac7088
        [1] 0xfffffc0005ac7148
    }
}

The major in-memory structure providing access to bitfiles is the bfAccess structure. It also points to the extent map. The bfNode structure is the main route from the FAS structures to the BAS structures through the bfAccess structure. These relationships are shown in the following figure.

Figure 3-2: In-Memory Per File Structures

[Figure: two groups of linked structures. Per file: vnode, bfNode, fsContext, bfAccess, extent map, disk block. Fileset and domain: mount, fileSetNode, bfSet, domain.]

In-Memory Per Fileset Structures

Characteristics of the in-memory, per fileset structures:

• The mount structure for this file system is linked to the mount structure of the root file system (m_nxt) which is found through the global symbol rootfs.

• The mounted file system’s mount structure (m_data) points to the file system-specific mount structure, in this case the AdvFS fileSetNode structure.

• The vnode of the mounted-upon directory (v_mountedhere) is set to point to the mounted file system’s mount structure to represent where the file system has been mounted.

• This mount structure (m_vnodecovered) points back to the vnode.

• Attached to the vnodes of the active files of the mounted file system are bfNode data structures, which represent the files.

fileSetNode Structure

An AdvFS fileset is represented within the FAS layer by the fileSetNode structure. The fileSetNode structure is the AdvFS-specific mount structure. It has pointers to:

• domain structure

• vnode for root

• mount structure

The following is an excerpt from ms_osf.h describing the fields of the fileSetNode structure.

Example 3-10: Fields of the fileSetNode Structure

typedef struct fileSetNode {
    struct fileSetNode *fsNext;
    struct fileSetNode **fsPrev;
    bfTagT rootTag;            /* tag of root directory */
    bfTagT tagsTag;            /* tag of ".tags */
    uint_t filesetMagic;       /* magic number: structure validation */
    domainT *dmnP;
    bfAccessT *rootAccessp;    /* Access structure pointer for root */
    bfSetIdT bfSetId;
    bfSetT *bfSetp;            /* bitfile-set descriptor pointer */
    struct vnode *root_vp;
    int fsFlags;               /* flags, see below */
    struct mount *mountp;      /* mount table pointer */
    unsigned quotaStatus;      /* see definitions below */
    long blkHLimit;            /* maximum quota blocks in fileset */
    long blkSLimit;            /* soft limit for fileset blks */
    long fileHLimit;           /* maximum number of files in fileset */
    long fileSLimit;           /* soft limit for fileset files */
    long blksUsed;             /* number of quota blocks used */
    long filesUsed;            /* number of bitfiles used */
    time_t blkTLimit;          /* time limit for excessive disk blk use */
    time_t fileTLimit;         /* time limit for excessive file use */
    mutexT filesetMutex;       /* protect next two fields */
    quotaInfoT qi[MAXQUOTAS];
    fileSetStatsT fileSetStats;
} fileSetNodeT;

Source location: msfs/ms_osf.h.

The following example displays the fileSetNode structure.

Example 3-11: Displaying the fileSetNode Structure

(dbx) p *(*(struct fsContext *)$fscontext).fileSetNode
struct {
    fsNext = (nil)
    fsPrev = 0xfffffc0005ab5348
    rootTag = struct {
        num = 2
        seq = 32769
    }
    tagsTag = struct {
        num = 3
        seq = 32769
    }
    filesetMagic = 2918187013         <== 0xadf00005
    dmnP = 0xfffffc0000f24008
    rootAccessp = 0xfffffc0005af7688
    bfSetId = struct {
        domainId = struct {
            tv_sec = 937059922
            tv_usec = 653520
        }
        dirTag = struct {
            num = 1
            seq = 32769
        }
    }
    bfSetp = 0xfffffc0005b7ca08
    root_vp = 0xfffffc0005ac98c0
    fsFlags = 0
    mountp = 0xfffffc0005ab2a80
    quotaStatus = 1421
    blkHLimit = 0
    blkSLimit = 0
    fileHLimit = 0
    fileSLimit = 0
    blksUsed = 1025520
    filesUsed = 23689
    (...)
    fileSetStats = struct {
        msfs_lookup = 19671
        lookup = struct {
            hit = 17494
            hit_not_found = 1079
            miss = 1098
        }
        msfs_create = 2
        msfs_mknod = 0
        (...)
    }
}
(dbx) set $bfset=0xfffffc0005b7ca08
(dbx) set $domain=0xfffffc0000f24008

Figure 3-3: In-Memory Per Fileset Structures

[Figure: rootfs leads to the mount structure of the root file system and, through m_nxt, to the mount structure of the mounted AdvFS fileset. Labels on the first mount structure: m_nxt, m_info, ufsmount, and a list of vnodes, each with v_mount and v_data (inode), one of them with v_mountedhere. Labels on the second mount structure: m_nxt, m_vnodecovered, m_info, fileSetNode, and a list of vnodes, each with v_mount and v_data (bfNode).]

Fileset Quota Structures

Since fileset quotas are a per-fileset capability, the information describing them is held within the fileSetNode structure.

Characteristics of the fileset quota structures include:

• fileSetNode contains an array of quotaInfo structures, which identify the fileset quotas.

• fileSetNode contains several limit fields, which hold the limits set by chfsets.

Source locations:

• msfs/msfs/ms_osf.h contains defines.

• msfs/fs/fs_quota.h contains routines.

User and Group Quota Structures

Since user and group quotas are pertinent to file usage, the information describing them is in the fsContext structure.

Characteristics of the user/group quota structures include:

• fsContext points to dQuot structures.

• Kernel maintains disk quota cache:

— Table is DqHashTbl

— Access via dqget()

• msfs/msfs/fs_quota.h contains includes.

• msfs/fs/fs_quota.c contains routines.

Locating the BAS Layer

Overview

The lowest layer of AdvFS is the BAS layer. Files are represented at this layer by bfAccess structures. This large structure locates the other BAS structures supporting bitfiles, bitfile-sets, domains, and volumes.

• BAS layer structure overview

• Access to BAS structures

• bfAccess structure

• Managing bfAccess structures

• bfSet structures

• Finding bfSet structures

• domain structures

• Finding domain structures

• vd structures

BAS Layer Structure Overview

BAS layer structures include:

bfAccess structure

• Bitfile structure

• One per open file

bfSet

• bitfile-set structure

• One per fileset

domainT

• File domain

• One per domain

vd structure

• Virtual disks

• One per AdvFS volume

Access to BAS Structures

V5 of Tru64 UNIX, which supports both V3 and V4 of AdvFS, provides access to the BAS structures through pointers rather than handles. Most handles had to be divided into bit sequences to be used.

bfAccess Structure

The bfAccess structure holds the in-memory state of a bitfile. It contains:

• Links to other bitfile access structures

• Pointer to a vnode

• Pointer to a vm object

• Highest LSN written to a log

• Pointers to next clone’s bfAccess structure (if a clone)

• Bitfile set pointer

• Domain pointer

• Primary metadata cell ID

• Volume containing primary mcell

Source location: msfs/bs_access.h

Managing bfAccess Structures

The bfAccess structures are allocated as needed.

• Free access list: linked list of available structures

• Closed access list: closed and dirty bitfiles that need a bit more work before freeing

Source location: msfs/bs/bs_access.c

The following example uses the bfAccess structure to get extent information about an open file. The address of the bfAccess structure is found in the bfNode structure.

Example 3-12: Accessing Extent Data Through bfAccess

(dbx) p *(*(struct bfAccess *)$bfaccess).xtnts.xtntMap.subXtntMap[0].bsXA[0]
struct {
    bsPage = 0
    vdBlk = 222416
}
(dbx) q
#
#
# pwd
/usr/bruden
# showfile -x ob_1

         Id  Vol  PgSz  Pages  XtntType  Segs  SegSz  I/O    Perf  File
  5c98.8002    1    16      3    simple    **     **  async  100%  ob_1

    extentMap: 1
        pageOff    pageCnt    vol    volBlock    blockCnt
              0          3      1      222416          48
        extentCnt: 1
#

bfSet Structure

The bfSet structure is the lower-level (BAS) structure for a fileset. It includes:

• Pointer to the fileset’s domain

• Tag and references to this set’s tag directory

• Cloned/master state

• Fragment file information

• Back pointer to fileset node

Source location: msfs/msfs/bs_bitfile_sets.h

Finding bfSet Structures

The elements of BfSetHashTbl indirectly point to bfSet structures.

The following macro generates a hash key value for accessing the BfSetHashTbl.

Example 3-13: Hash Key for BfSetHashTbl

#define BFSET_GET_HASH_KEY( _bfSetId ) ( (_bfSetId).domainId.tv_sec + (_bfSetId).dirTag.num )

Source location: msfs/bs/bs_bitfile_sets.h

domain Structure

The domain structure includes domain-specific information:

• Root tag directory references

• Log location and state information

• Overall buffering state information

Source location: msfs/msfs/bs_domain.h

There is another structure domain in the networking side of UNIX. Reference the AdvFS structure with the symbol name domainT.

Finding domain Structures

Source location: msfs/bs/bs_domain.c

The following example displays the fields in the domain structure. The address of the domain structure is found in the bfAccess structure which is pointed to by the bfNode extension of the vnode.

Example 3-14: Displaying the domain Structure

(dbx) p *(domainT *)$domain
struct {
    mutex = struct {
        mutex = 0
    }
    dmnMagic = 2918187011
    dmnFwd = 0xfffffc0000f24008
    dmnBwd = 0xfffffc0000f24008
    dmnHashlinks = struct {
        dh_links = struct {
            dh_next = 0xfffffc0000f24008
            dh_prev = 0xfffffc0000f24008
        }
        dh_key = 937059922
    }
    dmnVersion = 4
    state = BFD_ACTIVATED
    domainId = struct {
        tv_sec = 937059922
        tv_usec = 653520
    }
    dualMountId = struct {
        tv_sec = 0
        tv_usec = 0


    }
    bfDmnMntId = struct {
        tv_sec = 937392621
        tv_usec = 874423
    }
    dmnAccCnt = 4
    dmnRefWaiters = 0
    activateCnt = 2
    mountCnt = 2
    bfSetDirp = 0xfffffc0005b7c788
    bfSetDirTag = struct {
        num = 4294967288
        seq = 0
    }
    (...)
    bfSetHead = struct {
        bfsQfwd = 0xfffffc0005b7cce8
        bfsQbck = 0xfffffc0005b7c7e8
    }
    bfSetDirAccp = 0xfffffc0005af8488
    ftxLogTag = struct {
        num = 4294967287
        seq = 0
    }
    ftxLogP = 0xfffffc0005baec48
    ftxLogPgs = 512
    logAccessp = 0xfffffc0005af8908
    (...)
    domainName = "usr_domain"
    majorNum = 2055
    flag = BFD_NORMAL
    lsnLock = struct {
        mutex = 0
    }
    lsnList = struct {
        lsnFwd = 0xfffffe04075c0e68
        lsnBwd = 0xfffffe04075c0e68
    (...)
    vdCnt = 1
    vdpTbl = {
        [0] 0xfffffc0000f2b508
        [1] (nil)
        [2] (nil)
    (...)
    bcStat = struct {
        pinHit = 14402
        pinHitWait = 784
        pinRead = 0
        refHit = 24821
        refHitWait = 69
        raBuf = 2458
        ubcHit = 1418
        unpinCnt = struct {
            lazy = 14162
            blocking = 66


            clean = 11
            log = 2172
        }
        derefCnt = 28516
        devRead = 3025
        devWrite = 3229
    (...)
    bmtStat = struct {
        fStatRead = 0
        fStatWrite = 5316
        resv1 = 0
        resv2 = 0
        bmtRecRead = {
            [0] 0
            [1] 0
    (...)
            [21] 0
        }
        bmtRecWrite = {
            [0] 0
            [1] 0
            [2] 97
            [3] 0
    (...)
            [21] 0
        }
    }
    logStat = struct {
        logWrites = 313
        transactions = 9860
        segmentedRecs = 3
        logTrims = 0
        wastedWords = 27558
        maxLogPgs = 102
        minLogPgs = 0
        maxFtxWords = 2127
        maxFtxAgent = 91
        maxFtxTblSlots = 15
        oldFtxTblAgent = 34
        excSlotWaits = 0
        fullSlotWaits = 0
        rsv1 = 0
        rsv2 = 0
        rsv3 = 0
        rsv4 = 0
    }
    totalBlks = 1426112
    freeBlks = 324480
    dmn_panic = 0
    (...)
    smsync_policy = 0
    metaPagep = 0xfffffe0400299008
    fs_full_time = 0
}


vd Structure

The vd structure is the per-virtual-disk structure. It includes:

• Pointer to device vnode

• Pointer to RBMT, BMT, and SBM bitfiles

• Physical characteristics of device

• I/O queuing information

The following is an excerpt from bs_vd.h showing descriptions of some of the fields in the data structure.

Example 3-15: Fields in the vd Structure

/*
 * vd - this structure describes a virtual disk, including accessed
 * bitfile references, its size, i/o queues, name, id, and an
 * access handle for the metadata bitfile.
 */

typedef struct vd {
    /*
    ** Static fields (ie - they are set once and never changed).
    */
    uint32T stgCluster;          /* num blks each stg bitmap bit */
    struct vnode *devVp;         /* device access (temp vnode *) */
    uint_t vdMagic;              /* magic number: structure validation */
    bfAccessT *rbmtp;            /* access structure pointer for RBMT */
    bfAccessT *bmtp;             /* access structure pointer for BMT */
    bfAccessT *sbmp;             /* access structure pointer for SBM */
    domainT *dmnP;               /* domain pointer for ds */
    uint32T vdIndex;             /* 1-based virtual disk index */
    uint32T maxPgSz;             /* max possible page size on vd */
    uint32T bmtXtntPgs;          /* number of pages per BMT extent */
    char vdName[BS_VD_NAME_SZ];  /* temp - should be global name */

    /* The following fields are protected by the vdT.vdStateLock mutex */
    bsVdStatesT vdState;         /* vd state */
    struct thread *vdSetupThd;   /* Thread Id of the thread setting up vdT */
    uint32T vdRefCnt;            /* # threads actively using this volume */
    uint32T vdRefWaiters;        /* # threads waiting for vdRefCnt to goto 0 */
    mutexT vdStateLock;          /* lock for above 4 fields */

    /*
     * The following fields are protected by the vdScLock semaphore
     * in the domain structure. This lock is protected by the
     * domain mutex. Use the macros VD_SC_LOCK and VD_SC_UNLOCK.
     */
    uint32T vdSize;              /* count of vdSectorSize blocks in vd */
    int vdSectorSize;            /* Sector size, in bytes, normally 512 */
    uint32T vdClusters;          /* num clusters in vd */
    serviceClassT serviceClass;  /* service class provided */


    ftxLkT mcell_lk;             /* used with domain mutex */
    int nextMcellPg;             /* next available metadata cell’s page num */
    ftxLkT rbmt_mcell_lk;        /* This lock protects mcell allocation from
                                  * the rbmt mcell pool. This pool is used
                                  * to extend reserved bitfiles.
                                  */
    int lastRbmtPg;              /* last available reserved mcell’s page num */
    int rbmtFlags;               /* protected by rbmt_mcell_lk */

    ftxLkT stgMap_lk;            /* used with domain mutex */
    stgDescT *freeStgLst;        /* ptr to list of free storage descriptors */
    uint32T numFreeDesc;         /* number of free storage descriptors in list */
    uint32T freeClust;           /* total number free clusters */
    uint32T scanStartClust;      /* cluster where next bitmap scan will start */
    uint32T bitMapPgs;           /* number of pages in bitmap */
    uint32T spaceReturned;       /* space has been returned */
    stgDescT *fill1;             /* ptr to list of reserved storage descriptors */
    stgDescT *fill3;             /* ptr to list of free, reserved stg descriptors */
    uint32T fill4;               /* # of free, reserved stg descriptors in list */

    ftxLkT del_list_lk;          /* protects global defered delete list */

    lock_data_t ddlActiveLk;     /* Synchs processing of deferred-delete list entries */
                                 /* used with domain mutex */
    bfMCIdT ddlActiveWaitMCId;   /* If non-nil, a thread is waiting on this entry */
                                 /* Use domain mutex for synchronization */
    cvT ddlActiveWaitCv;         /* Used when waiting for active ddl entry */

    struct dStat dStat;          /* collect device statistics */

    /*
     * I/O queues; these fields protected by vdIoLock
     */
    mutexT vdIoLock;             /* simple lock for guarding I/O fields. */
    ioDescHdrT blockingQ;        /* For blocking I/O */
    ioDescHdrT waitLazyQ;        /* Transactional buffers w/ too high lsn */
    ioDescHdrT smSyncQ[SMSYNC_NQS]; /* smooth sync queues */
    ioDescHdrT readyLazyQ;       /* Sorted, ready for consolidation */
    ioDescHdrT consolQ;          /* Consolidated, ready to be written */
    ioDescHdrT devQ;             /* Tracks device */
    int blockingCnt;             /* keep track of how many times we can take */
#define BLOCKFACT 4
    int blockingFact;            /* from blocking q before taking from consol q */
    int rdmaxio;                 /* max blocks that can be read/written */
    int wrmaxio;                 /* in a consolidated I/O */
    int vdIoOut;                 /* There are outstanding I/O’s on this vd */
    int start_active;            /* Recursion preventer */
    int gen_active;              /* I/O generation loop active */
    stateLkT active;             /* indicates when disk (or lazy thread) is busy */
    short advfs_start_more_posted; /* 0 = no message yet posted */
                                 /* 1 = message posted but not processed */
                                 /* 2 = vd_free in progress */


#ifdef ADVFS_DEBUG
    enum deFlags errorFlag;
    int errorCount;
    int errorRepeat;
#endif /* ADVFS_DEBUG */

    u_long blkQ_cnt;             /* count of bufs placed onto blockingQ */
    u_long lazyQ_cnt;            /* count of bufs placed onto lazyQ */
    u_long smsyncQ_cnt;          /* count of bufs placed onto smsyncQ */
    u_long readyQ_cnt;           /* count of bufs placed onto readyQ */
    u_long consolQ_cnt;          /* count of bufs placed onto consQ */
    u_long devQ_cnt;             /* count of bufs placed onto devQ */
    u_long rmioq_cnt;            /* count of bufs rm_ioq’ed */
    u_long rmormvq_cnt;          /* count of bufs rm_or_moveq’ed */
    u_int syncQIndx;             /* next smsync queue to be processed */
    /* end of fields protected by vdIoLock */

    int consolidate;             /* Flag, one indicates disk can take big io’s */
    int max_iosize_rd;           /* From device */
    int max_iosize_wr;           /* From device */
    int preferred_iosize_rd;     /* From device */
    int preferred_iosize_wr;     /* From device */
    int qtodev;                  /* max number of I/O’s to be queued to dev */

    stgDescT freeRsvdStg;        /* desc for free rsvd stg for rsvd files */
#ifdef ADVFS_VD_TRACE
    uint32T trace_ptr;
    vdTraceElmtT trace_buf[VD_TRACE_HISTORY];
#endif
} vdT;


Defining Other In-Memory Structures

Free Space Cache

The free space cache is a per-volume, in-core structure (structure stgDesc):

• Linked list of contiguous free clusters

• Each entry gives the starting block and size of each free area

To avoid costly I/Os and bitmap scanning when searching for free space on a volume, AdvFS uses an in-memory free space cache to keep track of free space on a volume. The cache is a linked list of free space extents. It is a cache because it has a limited number of entries so it does not represent all the free space on a volume. The entries in the cache are sorted by cluster number.

The free space cache is filled whenever it becomes empty (or when it is explicitly invalidated). It is filled by scanning the bitmap and creating cache entries for free space extents in the bitmap.

The following example describes the storage descriptor structure found in msfs/msfs/bs_vd.h. This structure uses data drawn from the SBM.

Example 3-16: Fields in the stgDesc Structure

/*
 * stgDescT - Describes a contiguous set of available (free) vd blocks.
 * These structures are used to maintain a list of free disk space. There
 * is a free list in each vd structure. The list is ordered by virtual
 * disk block (it could also be ordered by the size of each contiguous
 * set of blocks in the future). Refer to the "sbm_" routines in
 * bs_sbm.c.
 */

typedef struct stgDesc {
    uint32T start_clust;     /* vd cluster number of first free cluster */
    uint32T num_clust;       /* number of free clusters */
    struct stgDesc *prevp;
    struct stgDesc *nextp;
} stgDescT;

Bitfile Buffer Descriptor

Characteristics of the bitfile buffer descriptor include:

• Descriptor for bitfile pages

• Structure bsBuf

— Lots of fields for doubly linked lists

— Log record addresses


— Page address

Domain, bitfile-set, fileset, page

— Physical location

— bfAccess structure

— I/O descriptor information and queues

Migrating pages may have more than one I/O descriptor.

This structure gives in-core information about an AdvFS page stored in the primary cache. Pinned pages may be found here.

Source location: msfs/msfs/bs_buf.h

I/O Descriptor

Characteristics of the I/O descriptor include:

• Links for the I/O queues

• Block descriptor

— Virtual disk

— Block

• Address of buffer

• Pointer to bsBuf structure

The source for the structure ioDesc is found in msfs/msfs/bs_ims.h.

FTX State Structure

Characteristics of the FTX state structure include:

• Structure ftx or ftxStateT

• Fields include log record numbers:

— First and last written

— Undo back link

Source location: msfs/msfs/ftx_privates.h


Summary

Examining AdvFS In-Memory Structures

Recall that all I/O flows through the VFS software, which directs it to the appropriate file system-specific software. The following file system layers are supported by in-memory structures.

    VFS layer      Vnode and mount structures
    FAS layer      POSIX file and fileset structures
    BAS layer      Bitfile and bitfile-set structures

Checking the VFS Layer

The virtual file system acts as a director of subsequent file system activities. This software determines which type of file system the I/O is destined for and directs the logic flow accordingly.

• VFS specific structures

• The vnode structure

• The mount structure

Explaining the FAS Layer

The FAS layer in-memory structures include the following.

    bfNode         Bitfile node pointer
    fsContext      Fileset context (points to parent fileset)
    fileSetNode    Fileset node (AdvFS private mount information)
    dQuot          Disk quota cache

Locating the BAS Layer

The BAS layer in-memory structures include the following.

    bfAccess       Provides access to bitfiles
    bfSet          Bitfile-set structure
    domainT        File domain
    vd             Virtual disks


Defining Other In-Memory Structures

To avoid costly I/Os and bitmap scanning when searching for free space on a volume, AdvFS uses an in-memory free space cache to keep track of free space on a volume. The cache is a linked list of free space extents.

The bitfile buffer descriptor gives in-core information about an AdvFS page stored in the primary cache. Pinned pages may be found here.

The I/O descriptor contains links for the I/O queues.

The FTX state structure (ftx) describes transactions for logging.


Exercises

Start a process that opens an AdvFS file but does not close it. If you are a C or Korn shell user, start a cat process and press ^Z to suspend it (or use the more command).

% cd some-advfs-directory
% cat > testy.file
Here is one line.
^Z

Note the inode number of the file you have created and the ID of the process writing the file.

1. Determine the address of the file structure associated with the file. Use the ofile command of kdbx or use the dbx techniques shown in class. If you followed the suggestion above when creating the file, the address of the file structure will be found in file descriptor 1, standard output.

# kdbx -k /vmunix
....
(kdbx) ofile -pid 19279
Proc=0xfffffc0002590ca0 pid=19279
    ofile[ 0]=0xfffffc0003269680
    ofile[ 1]=0xfffffc0003269400
    ofile[ 2]=0xfffffc0003269680

2. Now print the file structure. Examine its values to make sure they seem reasonable. You may continue to use kdbx; however, if you encounter problems, try dbx, which does not crash as frequently.

# dbx -k /vmunix
.......
(dbx) set $f = (struct file *)0xfffffc0003269400
(dbx) print *(struct file *)$f

3. The f_data field of the file structure points to the file’s vnode. Save and print the address of the vnode. It’s useful to print the address so you’ll have it handy in case dbx crashes. Now print the vnode structure.

(dbx) set $v=(struct vnode *)(((struct file *)$f)->f_data)
(dbx) p $v
0xfffffc0002964e00
(dbx) print *(struct vnode *)$v

You should not have to type the (struct vnode *) type cast.

Feel free to use a supplied alias, or create your own.


4. The bfNode and fileset context structures are in the private area of the vnode. Set a pointer to the bfNode and print its contents. The bfNode is an extension to the vnode and contains a pointer to the fsContext structure.

(dbx) set $bf=(struct bfNode *)&(((struct vnode *)$v)->v_data)
(dbx) p $bf
0xfffffc0002964eb8
(dbx) p *(struct bfNode *)$bf

Note the access and fileset context pointers for your file.

5. Verify that you are looking at the right file by matching the tag number, as shown by showfile, with the tag you see with dbx.

6. Obtain a pointer to the fileset context and print it.

(dbx) set $fc=(struct fsContext *)(($bf)->fsContextp)
(dbx) p $fc
0xfffffc0002964ee0
(dbx) p *(struct fsContext *)$fc
struct {
    initialized = 1
    .......
    dir_stats = struct {
        st_ino = struct {
            num = 13104
            seq = 32790
        }
    .......
    fileSetNode = 0xfffffc0003f26288
    diskQuot = {
        [0] 0xfffffc00014bc988
        [1] 0xfffffc00014bca08
    }
}

Note how information contained in several BMT records related to the file has been placed into one in-memory structure. Verify that the POSIX file statistics information seems reasonable.

7. Print out the two disk quota structures.

(dbx) set $qu=(struct dQuot *)(($fc)->diskQuot[0])
(dbx) p *(struct dQuot *)$qu
.......
(dbx) set $qg=(struct dQuot *)(($fc)->diskQuot[1])
(dbx) p *(struct dQuot *)$qg

At this point you have looked at the in-core, FAS-level structures for the file. Now look at the FAS-level structure for the file system or fileset. You could go directly from the fileset context structure, but let’s take the scenic tour through the mount structure.


8. There is a pointer to the mount structure inside the vnode. Print the structure. Wake up when you see the few MSFS-specific fields.

(dbx) set $m=(struct mount *)(((struct vnode *)$v)->v_mount)
(dbx) print $m
(dbx) print *(struct mount *)$m

9. The m_info field of the mount structure contains a pointer to AdvFS private file system information. This information is the fileSetNode. Print it.

(dbx) set $fsn=(struct fileSetNode *)(((struct mount *)$m)->m_info)
(dbx) px $fsn
0xfffffc0003f26288
(dbx) p *(struct fileSetNode *)$fsn
struct {
    ........
    domainId = struct {
        tv_sec = 865089685
        tv_usec = 897520
    }
    .......
    bfSetH = struct {
        setH = 4
        dmnH = 2
    }
    root_vp = 0xfffffc0001451200
    .......
    fileSetStats = struct {
        msfs_lookup = 8721216
    .......

You will see all sorts of interesting information: domain ID, bitfile-set handle, pointer to the file system’s root directory, and lots of statistics.

10. Use showfdmn to verify you have a domain ID match.

The significant FAS-level structures have now been studied.

11. Print the access structure and see if its tag number matches your target.

(dbx) p *(struct bfAccess *)$bfa
struct {
    fwd = 0xffffffff8077a3a8
    bwd = 0xfffffc0000601c90
    .......
    tag = struct {
        num = 12572
        seq = 32797
    }
    .......


12. Examine the back pointers to the vnode and VM object. Use these fields for additional confirmation that you have the right target. You will also see pointers to extent map information and the bitfile-set and domain structures.

(dbx) p *(struct bfAccess *)$bfa

13. Print the extent map.

For most bitfiles this will not be a very exciting structure; however, for bitfiles with many extents, this is the beginning of a mass of pointers. Note that the subXtntMap field is an array with validCnt elements. Each subXtntMap structure has an array of extents (bsXA) with cnt elements.

We can now proceed to the bitfile-set, domain, and virtual disk data structures of AdvFS. Use the pointers of the bitfile access structure to find them.

14. Move from the bitfile access structure into the bitfile-set structure. Print it. There is a lot to see. Be sure to use the bitfile-set ID field to verify and match the values returned by showfsets.

(dbx) set $bfs=(bfSetT *)(((struct bfAccess *)$bfa)->bfSetp)
(dbx) p $bfs
(dbx) p *(bfSetT *)$bfs
struct {
    bfSetId = struct {
        domainId = struct {
            tv_sec = 864927707
            tv_usec = 860832
        }
        dirTag = struct {
            num = 1
            seq = 32769
        }
    }
    .......
    dmnp = 0xfffffc000123e008

15. Now print the domain structure. (Do not use struct domain unless you want the structure for socket domains.) In the middle of this structure, you will see an array for pointers to virtual disk structures. There are also many fields used to control file domain I/O.

(dbx) set $d=(domainT *)(((bfSetT *)$bfs)->dmnp)
(dbx) p $d
0xfffffc000123e008
(dbx) p *(domainT *)$d
struct {
    .......
    domainName = "usr_domain"
    .......
    vdpTbl = {
        [0] 0xfffffc0003a18388
    .......


16. The last major structure to print is used for virtual disks. You will see even more I/O control substructures here.

(dbx) set $vd=(struct vd *)(((domainT *)$d)->vdpTbl[0])
(dbx) p $vd
0xfffffc0003a18388
(dbx) p *(struct vd *)$vd
struct {
    .......
    vdName = "/dev/disk/dsk2g"
    ........
    freeStgLst = 0xfffffc0003fcf188

17. For the finale, print a data structure of the free storage cache.

(dbx) set $fst=(stgDescT *)(((struct vd *)$vd)->freeStgLst)
(dbx) p $fst
0xfffffc0003fcf188
(dbx) p *(stgDescT *)$fst
struct {
    start_clust = 705832
    num_clust = 14360
    prevp = 0xfffffc0003fcf188
    nextp = 0xfffffc0003fcf188
}


Solutions

Start a process that opens an AdvFS file but does not close it. If you are a C or Korn shell user, start a cat process and press ^Z to suspend it (or just use more).

% cd some-advfs-directory
% cat > testy.file
Here is one line.
^Z

Note the inode number of the file you have created and the ID of the process writing the file.

# cat openone.c

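/* openone.c */

/* SAMPLE PROGRAM FOR FILE OPEN TESTING */

/* Opens or creates a file named ob_1 in the current directory. */

#include <sys/file.h>
#include <fcntl.h>      /* open(), O_RDWR, O_CREAT */
#include <stdio.h>      /* printf(), perror(), getchar() */
#include <stdlib.h>     /* exit(), EXIT_FAILURE */
#include <unistd.h>     /* getuid(), getpid(), close() */

int main(void)
{
    int fd, uid, pid;

    /* Open (or create) the file and keep it open until a key is pressed. */
    fd = open("ob_1", O_RDWR | O_CREAT, 0777);
    if (fd == -1)
    {
        perror("open failed ");
        exit(EXIT_FAILURE);
    }
    printf("file opened -- file descriptor is %d.\n", fd);

    uid = getuid();
    printf("uid is %d.\n", uid);

    pid = getpid();
    printf("pid is %d.\n", pid);

    /* Blocking here keeps the file structure and vnode in memory. */
    printf("Hit any key to close file and terminate program.\n");
    getchar();

    close(fd);
    printf("done\n");
    return 0;
}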

#

# cc -o openone openone.c

#

# ./openone&

[1] 816


# file opened -- file descriptor is 3.

uid is 0.

pid is 816.

Hit any key to close file and terminate program.

#

[1] + Stopped(SIGTTIN) ./openone&

#

# ps | grep open

816 pts/5 T N 0:00.02 ./openone

818 pts/5 U + 0:00.01 grep open

#

#

# ls -li ob_1

23723 -rwxr-xr-x 1 root system 0 Sep 30 09:01 ob_1

1. Determine the address of the file structure associated with the file. Use the ofile command of kdbx or use the dbx techniques shown in class. If you followed the suggestion above when creating the file, the address of the file structure will be found in file descriptor 1, standard output.

# kdbx -k /vmunix
....
(kdbx) ofile -pid 19279
Proc=0xfffffc0002590ca0 pid=19279
    ofile[ 0]=0xfffffc0003269680
    ofile[ 1]=0xfffffc0003269400
    ofile[ 2]=0xfffffc0003269680

# kdbx -k /vmunix

dbx version 5.0

Type ’help’ for help.

stopped at [thread_block:2709 ,0xfffffc00002c8084] Source not available

warning: Files compiled -g3: parameter values probably wrong

(kdbx)

(kdbx)

(kdbx)

(kdbx) ofile -pid 816

Proc=0xfffffc00059b2c80 pid= 816

ofile[ 0]=0xfffffc000248e040

ofile[ 1]=0xfffffc000248e040

ofile[ 2]=0xfffffc000248e040

ofile[ 3]=0xfffffc0002eef500

(kdbx)

(kdbx)

(kdbx) q

dbx (pid 832) died. Exiting...

#

Through dbx


#

# dbx -k /vmunix

dbx version 5.0

Type ’help’ for help.

stopped at [thread_block:2709 ,0xfffffc00002c8084] Source not available

warning: Files compiled -g3: parameter values probably wrong

(dbx)

(dbx)

(dbx)

(dbx) set $pid=816

(dbx)

(dbx) p (*(struct super_task *)thread.task).utask.uu_file_state.uf_entry[0][3].ufe_ofile

0xfffffc0002eef500

2. Now print the file structure. Examine its values to make sure they seem reasonable. You can continue to use kdbx; however, if it starts giving you trouble, remember that dbx does not crash as frequently.

# dbx -k /vmunix
.......
(dbx) set $f = (struct file *)0xfffffc0003269400
(dbx) print *(struct file *)$f

(dbx) set $f = 0xfffffc0002eef500

(dbx)

(dbx) px $f

0xfffffc0002eef500

(dbx)

(dbx)

(dbx) p *(struct file *)$f

struct {

f_incore_lock = 0

f_flag = 3

f_count = 1

f_type = 1

f_msgcount = 0

f_cred = 0xfffffc000346c3c0

f_ops = 0xfffffc00006c5990

f_data = 0xfffffc00020ec000 = ""

f_u = union {

fu_offset = 0

fu_freef = (nil)

}

f_io_lock = 0

f_io_waiters = 0

}

3. The f_data field of the file structure points to the file’s vnode. Save and print the address of the vnode. It is useful to print the address so you will have it handy in case dbx crashes. Now print the vnode structure.

(dbx) set $v=(struct vnode *)(((struct file *)$f)->f_data)
(dbx) p $v
0xfffffc0002964e00
(dbx) print *(struct vnode *)$v

You should not have to type the (struct vnode *) type cast. Feel free to use a supplied macro or create your own.

(dbx) set $v = 0xfffffc00020ec000

(dbx)

(dbx) p *(struct vnode *)$v

struct {

v_lock = 0

v_flag = 0

v_usecount = 1

v_aux_lockers = 0

v_shlockc = 0

v_exlockc = 0

v_lastr = 0

v_id = 29695

v_type = VREG

v_tag = VT_MSFS

v_mount = 0xfffffc0005ab2a80

v_mountedhere = (nil)

v_op = 0xfffffc00006b01d0

v_freef = (nil)

v_freeb = (nil)

v_mountf = 0xfffffc0002218900

v_mountb = 0xfffffc0004ad4058

v_buflists_lock = 0

v_cleanblkhd = (nil)

v_dirtyblkhd = (nil)

v_ncache_time = 1765

v_free_time = 1286

v_output_lock = 0

v_numoutput = 0

v_outflag = 0

v_cache_lookup_refs = 0

v_rdcnt = 1

v_wrcnt = 1

v_dirtyblkcnt = 0

v_dirtyblkpush = 0

v_un = union {

vu_socket = (nil)

vu_specinfo = (nil)

vu_fifonode = (nil)

}

v_object = 0xfffffc00027f9380

v_secattr = (nil)

v_data = "

}

(dbx)

(dbx)

(dbx) alias v5_get_ofile_vnode_struct(pidd,fd) "set $pid=pidd;p *(struct vnode *)(*(struct super_task *)thread.task).utask.uu_file_state.uf_entry[0][fd].ufe_ofile.f_data"


(dbx)

(dbx)

(dbx) v5_get_ofile_vnode_struct(816,3)

struct {

v_lock = 0

v_flag = 0

v_usecount = 1

v_aux_lockers = 0

v_shlockc = 0

v_exlockc = 0

v_lastr = 0

v_id = 29695

v_type = VREG

v_tag = VT_MSFS

v_mount = 0xfffffc0005ab2a80

v_mountedhere = (nil)

v_op = 0xfffffc00006b01d0

v_freef = (nil)

v_freeb = (nil)

v_mountf = 0xfffffc0002218900

v_mountb = 0xfffffc0004ad4058

v_buflists_lock = 0

v_cleanblkhd = (nil)

v_dirtyblkhd = (nil)

v_ncache_time = 1765

v_free_time = 1286

v_output_lock = 0

v_numoutput = 0

v_outflag = 0

v_cache_lookup_refs = 0

v_rdcnt = 1

v_wrcnt = 1

v_dirtyblkcnt = 0

v_dirtyblkpush = 0

v_un = union {

vu_socket = (nil)

vu_specinfo = (nil)

vu_fifonode = (nil)

}

v_object = 0xfffffc00027f9380

v_secattr = (nil)

v_data = "

}

4. The bfNode and fileset context structures are in the private area of the vnode. Set a pointer to the bfNode and print its contents. The bfNode is an extension to the vnode and contains a pointer to the fsContext structure.

(dbx) set $bf=(struct bfNode *)&(((struct vnode *)$v)->v_data)
(dbx) p $bf
0xfffffc0002964eb8
(dbx) p *(struct bfNode *)$bf

Note the access and fileset context pointers for your file.


(dbx) px $v

0xfffffc00020ec000

(dbx)

(dbx)

(dbx) p &(*(struct vnode *)$v).v_data

0xfffffc00020ec0c8 = "^H^;\256^E"

(dbx)

(dbx)

(dbx) set $bf=0xfffffc00020ec0c8

(dbx)

(dbx)

(dbx) p *(struct bfNode *)$bf

struct {

accessp = 0xfffffc0005aefb08

fsContextp = 0xfffffc00020ec0f0

tag = struct {

num = 23723

seq = 32771

}

bfSetId = struct {

domainId = struct {

tv_sec = 937059922

tv_usec = 653520

}

dirTag = struct {

num = 1

seq = 32769

}

}

}

(dbx)

(dbx)

(dbx) whatis struct bfNode

struct bfNode {

struct bfAccess * accessp;

struct fsContext * fsContextp;

bfTagT tag;

bfSetIdT bfSetId;

};

(dbx)

(dbx)

(dbx) set $fsc = 0xfffffc00020ec0f0

(dbx)

(dbx)

(dbx) p *(struct fsContext *)$fsc

struct {

initialized = 1

quotaInitialized = 1

undel_dir_tag = struct {

num = 0

seq = 0

}

fs_flag = 0

dirty_stats = 0


dirty_alloc = 0

file_lock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc00020ec118

prev = 0xfffffc00020ec118

}

l_caller = 0

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = (nil)

}

dirstamp = 0

fsContext_mutex = struct {

mutex = 0

}

bf_tag = struct {

num = 23723

seq = 32771

}

last_offset = 0

dir_stats = struct {

st_ino = struct {

num = 23723

seq = 32771

}

st_mode = 33261

st_uid = 0

st_gid = 0

st_rdev = 0

st_size = 0

st_atime = 938696472

st_uatime = 976962000

st_mtime = 938696472

st_umtime = 976962000

st_ctime = 938696472

st_uctime = 976962000

st_flags = 0

dir_tag = struct {

num = 23707

seq = 32769

}

fragId = struct {

frag = 0

type = BF_FRAG_ANY

}

st_nlink = 1

st_unused_1 = 0

fragPageOffset = 0

st_unused_2 = 0

}


fileSetNode = 0xfffffc0005ab5088

diskQuot = {

[0] 0xfffffc0005ac7088

[1] 0xfffffc0005ac7148

}

}

5. Verify that you are looking at the right file by matching the tag number, as shown by showfile, with the tag you see with dbx.

(dbx) sh showfile -x ob_1

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File

5cab.8003 1 16 0 simple ** ** async 100% ob_1

extentMap: 1

pageOff pageCnt vol volBlock blockCnt

extentCnt: 0

(dbx)

(dbx)

(dbx) sh ls -li ob_1

23723 -rwxr-xr-x 1 root system 0 Sep 30 09:01 ob_1

(dbx)

6. Obtain a pointer to the fileset context and print it.

(dbx) set $fc=(struct fsContext *)(((struct bfNode *)$bf)->fsContextp)
(dbx) p $fc
0xfffffc0002964ee0
(dbx) p *(struct fsContext *)$fc
struct {
    initialized = 1
    .......
    dir_stats = struct {
        st_ino = struct {
            num = 13104
            seq = 32790
        }
    .......
    fileSetNode = 0xfffffc0003f26288
    diskQuot = {
        [0] 0xfffffc00014bc988
        [1] 0xfffffc00014bca08
    }
}

Note how information contained in several BMT records related to the file has been placed into one in-memory structure. Verify that the POSIX file statistics information seems reasonable. (Extracted from the solution for #4.)

(dbx) dir_stats = struct {

st_ino = struct {

num = 23723

seq = 32771

}


st_mode = 33261

st_uid = 0

st_gid = 0

st_rdev = 0

st_size = 0

st_atime = 938696472

st_uatime = 976962000

st_mtime = 938696472

st_umtime = 976962000

st_ctime = 938696472

st_uctime = 976962000

st_flags = 0

dir_tag = struct {

num = 23707

(dbx)

(dbx)

(dbx) po 33261

0100755

(dbx)

(dbx)

(dbx)

(dbx) sh ls -li ob_1

23723 -rwxr-xr-x 1 root system 0 Sep 30 09:01 ob_1

(dbx)

(dbx)

(dbx) sh ls -lid /usr/bruden/advfs

23707 drwxr-xr-x 2 root system 8192 Sep 30 09:01 /usr/bruden/advfs

(dbx)

7. Print out the two disk quota structures.

(dbx) set $qu=(struct dQuot *)(((struct fsContext *)$fc)->diskQuot[0])
(dbx) p *(struct dQuot *)$qu
.......
(dbx) set $qg=(struct dQuot *)(((struct fsContext *)$fc)->diskQuot[1])
(dbx) p *(struct dQuot *)$qg

(dbx) whatis struct fsContext

struct fsContext {

short initialized;

short quotaInitialized;

bfTagT undel_dir_tag;

long fs_flag;

int dirty_stats;

int dirty_alloc;

lock_data_t file_lock;

long dirstamp;

mutexT fsContext_mutex;

bfTagT bf_tag;

long last_offset;

struct fs_stat {

bfTagT st_ino;

mode_t st_mode;

uid_t st_uid;


gid_t st_gid;

dev_t st_rdev;

off_t st_size;

time_t st_atime;

int st_uatime;

time_t st_mtime;

int st_umtime;

time_t st_ctime;

int st_uctime;

uint_t st_flags;

bfTagT dir_tag;

bfFragIdT fragId;

u_short st_nlink;

short st_unused_1;

uint32T fragPageOffset;

uint32T st_unused_2;

} dir_stats;

struct fileSetNode * fileSetNode;

diskQuot[2] of struct dQuot *;

};

(dbx)

(dbx)

(dbx) whatis struct dQuot

struct dQuot {

dyn_hashlinks_w_keyT dq_links;

int dq_flags;

int dq_type;

int dq_cnt;

uint_t dq_id;

union {

struct dQBlk32 {

u_int dqb_bhardlimit;

u_int dqb_bsoftlimit;

u_int dqb_curblocks;

u_int dqb_ihardlimit;

u_int dqb_isoftlimit;

u_int dqb_curinodes;

time_t dqb_btime;

time_t dqb_itime;

} dq_dqb32;

struct dQBlk64 {

u_long dqb_bhardlimit;

u_long dqb_bsoftlimit;

u_long dqb_curblocks;

u_int dqb_ihardlimit;

u_int dqb_isoftlimit;

u_int dqb_curinodes;

u_int dqb_unused1;

time_t dqb_btime;

u_int dqb_unused2;

time_t dqb_itime;

u_int dqb_unused3;

u_long dqb_unused4;

} dq_dqb64;


} dQ;

struct fileSetNode * fileSetNode;

lock_data_t dqLock;

};

(dbx)

(dbx)

(dbx) p *(*(struct fsContext *)$fsc).diskQuot[0]

struct {

dq_links = struct {

dh_links = struct {

dh_next = 0xfffffc0005ac7088

dh_prev = 0xfffffc0005ac7088

}

dh_key = 36028788429215309

}

dq_flags = 8

dq_type = 0

dq_cnt = 203

dq_id = 0

dQ = union {

dq_dqb32 = struct {

dqb_bhardlimit = 0

dqb_bsoftlimit = 0

dqb_curblocks = 0

dqb_ihardlimit = 0

dqb_isoftlimit = 203192

dqb_curinodes = 0

dqb_btime = 0

dqb_itime = 0

}

dq_dqb64 = struct {

dqb_bhardlimit = 0

dqb_bsoftlimit = 0

dqb_curblocks = 203192

dqb_ihardlimit = 0

dqb_isoftlimit = 0

dqb_curinodes = 7477

dqb_unused1 = 0

dqb_btime = 0

dqb_unused2 = 0

dqb_itime = 0

dqb_unused3 = 0

dqb_unused4 = 0

}

}

fileSetNode = 0xfffffc0005ab5088

dqLock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005ac7100

prev = 0xfffffc0005ac7100

}

l_caller = 3682780

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc0001c93500

}

}
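The dbx output above prints the dQ union once per member, so the same bytes appear under two sets of names: the 203192 shown as dqb_isoftlimit in the dq_dqb32 view is the same storage as dqb_curblocks in the dq_dqb64 view. The following is a minimal sketch of that overlap, not the AdvFS definition. It assumes a little-endian machine (as Alpha is), and the *_sketch types are hypothetical cut-down stand-ins that keep only the leading fields and use fixed-width types.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for dQBlk32 / dQBlk64: leading fields only,
 * with fixed-width types assumed for illustration. */
struct dqblk32_sketch {
    uint32_t bhardlimit, bsoftlimit, curblocks;
    uint32_t ihardlimit, isoftlimit, curinodes;
};

struct dqblk64_sketch {
    uint64_t bhardlimit, bsoftlimit, curblocks;
};

union dq_sketch {
    struct dqblk32_sketch dqb32;
    struct dqblk64_sketch dqb64;
};

/* Store a block count through the 64-bit view and report what the
 * 32-bit view shows in isoftlimit. Both members start at offset 0, so
 * the 64-bit curblocks (offset 16) covers isoftlimit (offset 16) and
 * curinodes (offset 20). On a little-endian machine the low half of
 * the count lands in isoftlimit. */
uint32_t isoftlimit_seen_by_32bit_view(uint64_t curblocks)
{
    union dq_sketch u;

    assert(sizeof u.dqb32 == sizeof u.dqb64);  /* 24 bytes each here */
    memset(&u, 0, sizeof u);
    u.dqb64.curblocks = curblocks;
    return u.dqb32.isoftlimit;
}
```

Calling isoftlimit_seen_by_32bit_view(203192) on a little-endian host reproduces the pairing seen in the dump. Only one of the two views is meaningful for a given structure; the other is an artifact of how the debugger prints a union.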

At this point we have seen the in-core, FAS-level structures for the file. Now look at the FAS-level structure for the file system or fileset. You could go directly from the fileset context structure, but take the scenic tour through the mount structure.

8. There is a pointer to the mount structure inside the vnode. Print the structure. Wake up when you see the few MSFS-specific fields.

(dbx) set $m=(struct mount *)(((struct vnode *)$v)->v_mount)
(dbx) print $m
(dbx) print *(struct mount *)$m

(dbx) p (*(struct vnode *)$v).v_mount

0xfffffc0005ab2a80

(dbx)

(dbx)

(dbx) set $m = 0xfffffc0005ab2a80

(dbx)

(dbx) p *(struct mount *)$m

struct {

m_lock = 18446739675758144512

m_flag = 20480

m_funnel = 0

m_nxt = 0xfffffc0005ab2d80

m_prev = 0xfffffc0005ab3380

m_op = 0xfffffc00006af990

m_vnodecovered = 0xfffffc0001987200

m_mounth = 0xfffffc00011c0fc0

m_vlist_lock = 0

m_exroot = 0

m_uid = 0

m_stat = struct {

f_type = 10

f_flags = 20480

f_fsize = 512

f_bsize = 8192

f_blocks = 1426112

f_bfree = 399740

f_bavail = 204992

f_files = 377869

f_ffree = 354165

f_fsid = struct {

val = {

[0] 3776054149

[1] 10

}

}

f_spare = {

[0] 0

[1] 0

[2] 0

}

f_mntonname = 0xfffffc0000d34c90 = "/usr"

f_mntfromname = 0xfffffc0000d34940 = "usr_domain#usr"

mount_info = union {

ufs_args = struct {

fspec = 0x9f8d037da6652

exflags = 1

exroot = 0

}

nfs_args = struct {

addr = 0x9f8d037da6652

fh = 0x1

flags = 0

wsize = 0

rsize = 0

timeo = 0

retrans = 0

maxtimo = 0

hostname = (nil)

acregmin = 0

acregmax = 0

acdirmin = 0

acdirmax = 0

netname = (nil)

pathconf = (nil)

}

mfs_args = struct {

name = 0x9f8d037da6652

base = 0x1

size = 0

}

cdfs_args = struct {

fspec = 0x9f8d037da6652

exflags = 1

exroot = 0

flags = 0

version = 0

default_uid = 0

default_gid = 0

default_fmode = 0

default_dmode = 0

map_uid_ct = 0

map_uid = (nil)

map_gid_ct = 0

map_gid = (nil)

}

procfs_args = struct {

fspec = 0x9f8d037da6652

exflags = 1

exroot = 0

}

msfs_args = struct {

id = struct {

id1 = 937059922

id2 = 653520

tag = 1

}

}

ffm_args = struct {

ffm_flags = 937059922

f_un = union {

ffm_pname = 0x1

ffm_fdesc = 1

}

}

}

}

m_info = 0xfffffc0005ab5088

m_nfs_errmsginfo = struct {

n_noexport = 0

last_noexport = 0

n_stalefh = 0

last_stalefh = 0

}

m_unmount_lock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005ab2ba8

prev = 0xfffffc0005ab2ba8

}

l_caller = 4783020

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc00025ac900

}

}

9. The m_info field of the mount structure contains a pointer to AdvFS private file system information. This information is the fileSetNode. Print it.

(dbx) set $fsn=(struct fileSetNode *)(((struct mount *)$m)->m_info)
(dbx) px $fsn
0xfffffc0003f26288
(dbx) p *(struct fileSetNode *)$fsn
struct {
........
    domainId = struct {
        tv_sec = 865089685
        tv_usec = 897520
    }
.......
    bfSetH = struct {
        setH = 4
        dmnH = 2
    }
    root_vp = 0xfffffc0001451200
.......
    fileSetStats = struct {
        msfs_lookup = 8721216
.......

You will see several kinds of interesting information: the domain ID, the bitfile-set handle, a pointer to the file system's root directory, and a large set of statistics.

(dbx) p (*(struct mount *)$m).m_info

0xfffffc0005ab5088

(dbx)

(dbx)

(dbx) set $fsn = 0xfffffc0005ab5088

(dbx)

(dbx)

(dbx) p *(struct fileSetNode *)$fsn

struct {

fsNext = (nil)

fsPrev = 0xfffffc0005ab5348

rootTag = struct {

num = 2

seq = 32769

}

tagsTag = struct {

num = 3

seq = 32769

}

filesetMagic = 2918187013

dmnP = 0xfffffc0000f24008

rootAccessp = 0xfffffc0005af7688

bfSetId = struct {

domainId = struct {

tv_sec = 937059922

tv_usec = 653520

}

dirTag = struct {

num = 1

seq = 32769

}

}

bfSetp = 0xfffffc0005b7ca08

root_vp = 0xfffffc0005ac98c0

fsFlags = 0

mountp = 0xfffffc0005ab2a80

quotaStatus = 1421

blkHLimit = 0

blkSLimit = 0

fileHLimit = 0

fileSLimit = 0

blksUsed = 1026404

filesUsed = 23704

blkTLimit = 0

fileTLimit = 0

filesetMutex = struct {

mutex = 0

}

qi = {

[0] struct {

qiAccessp = 0xfffffc0005af7208

qiContext = 0xfffffc0005ac9bf0

qiTag = struct {

num = 4

seq = 32769

}

qiBlkTime = 604800

qiFileTime = 604800

qiFlags = ’^@’

qiPgSz = 8192

qiFilePgs = 2

qiCred = 0xfffffc0005ac6e40

qiLock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005ab5178

prev = 0xfffffc0005ab5178

}

l_caller = 0

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = (nil)

}

}

[1] struct {

qiAccessp = 0xfffffc0005af6d88

qiContext = 0xfffffc0005ac9e30

qiTag = struct {

num = 5

seq = 32769

}

qiBlkTime = 604800

qiFileTime = 604800

qiFlags = ’^@’

qiPgSz = 8192

qiFilePgs = 1

qiCred = 0xfffffc0005ac6fc0

qiLock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005ab51e0

prev = 0xfffffc0005ab51e0

}

l_caller = 0

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = (nil)

}

}

}

fileSetStats = struct {

msfs_lookup = 99608

lookup = struct {

hit = 31366

hit_not_found = 1087

miss = 67149

}

msfs_create = 7

msfs_mknod = 0

msfs_open = 0

msfs_close = 5543

msfs_access = 113974

msfs_getattr = 182729

msfs_setattr = 13

msfs_read = 9986

msfs_write = 161

msfs_mmap = 1456

msfs_fsync = 1

msfs_seek = 2981

msfs_remove = 3

msfs_link = 0

msfs_rename = 0

msfs_mkdir = 0

msfs_rmdir = 0

msfs_symlink = 0

msfs_readdir = 4282

msfs_readlink = 2389

msfs_inactive = 86090

msfs_reclaim = 57420

msfs_page_read = 0

msfs_page_write = 0

msfs_getpage = 6706

msfs_putpage = 12

msfs_bread = 0

msfs_brelse = 0

msfs_lockctl = 29

msfs_setvlocks = 8

msfs_syncdata = 0

}

}
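The fileSetNode printed above carries a mountp back pointer (0xfffffc0005ab2a80), which is the same address that was stored in $m. That gives a cheap cross-check when walking vnode, mount, and fileSetNode by hand. The sketch below shows the idea with hypothetical *_sketch structures that keep only the pointer fields seen in the dbx output; it is an illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct mount_sketch;

/* Hypothetical cut-down structures: only the linking pointers. */
struct fileSetNode_sketch {
    struct mount_sketch *mountp;   /* back pointer to the mount structure */
};

struct mount_sketch {
    void *m_info;                  /* file system private data */
};

struct vnode_sketch {
    struct mount_sketch *v_mount;
};

/* Follow vnode -> mount -> fileSetNode, then confirm that the back
 * pointer leads to the same mount structure. Returns the fileSetNode,
 * or NULL if a link is missing or the back pointer disagrees. */
struct fileSetNode_sketch *fileset_from_vnode(const struct vnode_sketch *vp)
{
    struct mount_sketch *mp;
    struct fileSetNode_sketch *fsn;

    assert(vp != NULL);
    mp = vp->v_mount;
    if (mp == NULL || mp->m_info == NULL)
        return NULL;
    fsn = mp->m_info;
    return (fsn->mountp == mp) ? fsn : NULL;
}
```

In a dbx session the equivalent check is simply comparing the printed mountp value with $m before trusting the rest of the structure.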

10. Use showfdmn to verify you have a domain ID match. The significant FAS-level structures have now been studied.

(dbx) sh pwd

/usr/bruden/advfs

(dbx)

(dbx)

(dbx) sh showfdmn usr_domain

Id Date Created LogPgs Version Domain Name

37da6652.0009f8d0 Sat Sep 11 10:25:22 1999 512 4 usr_domain

Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name

1L 1426112 204960 86% on 256 256 /dev/disk/dsk1g
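The Id column of showfdmn is the domainId from the in-memory structures, printed in hexadecimal: tv_sec = 937059922 is 0x37da6652 and tv_usec = 653520 is 0x0009f8d0. A small sketch of that comparison follows; the function name is hypothetical, and the "%08x.%08x" layout is inferred from the output shown here rather than taken from the showfdmn source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the decimal tv_sec/tv_usec pair printed by dbx matches
 * the hexadecimal Id string printed by showfdmn, 0 otherwise. */
int domain_id_matches(unsigned int tv_sec, unsigned int tv_usec,
                      const char *shown)
{
    char buf[32];

    assert(shown != NULL);
    /* Two zero-padded 8-digit hex words separated by a dot. */
    snprintf(buf, sizeof buf, "%08x.%08x", tv_sec, tv_usec);
    return strcmp(buf, shown) == 0;
}
```

With the values from this session, domain_id_matches(937059922, 653520, "37da6652.0009f8d0") reports a match.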

11. Print the access structure and see if its tag number matches your target.

(dbx) p *(struct bfAccess *)$bfa
struct {
    fwd = 0xffffffff8077a3a8
    bwd = 0xfffffc0000601c90
.......
    tag = struct {
        num = 12572
        seq = 32797
    }
.......

(dbx) p *(struct bfNode *)$bf

struct {

accessp = 0xfffffc0005aefb08

fsContextp = 0xfffffc00020ec0f0

tag = struct {

num = 23723

seq = 32771

}

bfSetId = struct {

domainId = struct {

tv_sec = 937059922

tv_usec = 653520

}

dirTag = struct {

num = 1

seq = 32769

}

}

}

(dbx)

(dbx)

(dbx) set $bfa = 0xfffffc0005aefb08

(dbx)

(dbx)

(dbx) p *(struct bfAccess *)$bfa

struct {

hashlinks = struct {

dh_links = struct {

dh_next = 0xfffffc0003659208

dh_prev = 0xfffffc000199b688

}

dh_key = 2222450857

}

freeFwd = (nil)

freeBwd = (nil)

onFreeList = 0

accMagic = 2918187009

setFwd = 0xfffffc0003659688

setBwd = 0xfffffc00016a5b08

bfaLock = struct {

mutex = 0

}

accessCnt = 1

refCnt = 1

mmapCnt = 0

stateLk = struct {

hdr = struct {

lkType = LKT_STATE

nxtFtxLk = (nil)

mutex = 0xfffffc0005aefb48

lkUsage = LKU_BF_STATE

}

state = ACC_VALID

pendingState = LKW_NONE

waiters = 0

cv = 0

}

saved_stats = (nil)

bfVp = 0xfffffc00020ec000

bfObj = 0xfffffc00027f9380

bfIoLock = struct {

mutex = 0

}

dkResult = 0

miDkResult = 0

dirtyBufList = struct {

lsnFwd = (nil)

lsnBwd = (nil)

accFwd = 0xfffffc0005aefbb8

accBwd = 0xfffffc0005aefbb8

freeFwd = (nil)

freeBwd = (nil)

hashFwd = (nil)

hashBwd = (nil)

length = 0

touched = 0

ioOut = 0

lenLimit = 0

indexBuf = (nil)

}

cleanBufList = struct {

lsnFwd = (nil)

lsnBwd = (nil)

accFwd = 0xfffffc0005aefc10

accBwd = 0xfffffc0005aefc10

freeFwd = (nil)

freeBwd = (nil)

hashFwd = (nil)

hashBwd = (nil)

length = 0

touched = 0

ioOut = 0

lenLimit = 0

indexBuf = (nil)

}

flushWait = 0

maxFlushWaiters = 0

hiFlushLsn = struct {

num = 0

}

hiWaitLsn = struct {

num = 0

}

nextFlushSeq = struct {

num = 2

}

flushWaiterQ = struct {

head = 0xfffffc0005aefc78

tail = 0xfffffc0005aefc78

cnt = 0

}

msyncWait = 0

msyncNum = 0

raHitPage = 0

raStartPage = 0

logWrite = (nil)

trunc_xfer_lk = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005aefcb8

prev = 0xfffffc0005aefcb8

}

l_caller = 0

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = (nil)

}

cow_lk = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005aefce8

prev = 0xfffffc0005aefce8

}

l_caller = 0

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = (nil)

}

nextCloneAccp = (nil)

origAccp = (nil)

cowPgCount = 0

cloneId = 0

cloneCnt = 0

maxClonePgs = 0

dataSafety = BFD_NIL

noClone = 0

deleteWithClone = 0

outOfSyncClone = 0

trunc = 0

cloneAccHRefd = 0

fragState = FS_FRAG_NONE

fragId = struct {

frag = 0

type = BF_FRAG_ANY

}

fragPageOffset = 0

bfPageSz = 16

reqServices = 1

optServices = 0

tag = struct {

num = 23723

seq = 32771

}

bfState = BSRA_VALID

transitionId = 30271

file_size = 0

bfSetp = 0xfffffc0005b7ca08

dmnP = 0xfffffc0000f24008

mcellList_lk = struct {

hdr = struct {

lkType = LKT_FTX

nxtFtxLk = (nil)

mutex = 0xfffffc0005aefb48

lkUsage = LKU_UNKNOWN

}

lock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005aefdb8

prev = 0xfffffc0005aefdb8

}

l_caller = 3296636

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc0001f83800

}

cv = 0

}

xtntMap_lk = struct {

hdr = struct {

lkType = LKT_FTX

nxtFtxLk = (nil)

mutex = 0xfffffc0005aefb48

lkUsage = LKU_UNKNOWN

}

lock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005aefe10

prev = 0xfffffc0005aefe10

}

l_caller = 3415076

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc0001f83800

}

cv = 0

}

mapped = 1

nextPage = 0

extendSize = 0

primMCId = struct {

cell = 1

page = 985

}

primVdIndex = 1

xtnts = struct {

validFlag = 1

xtntMap = 0xfffffc0005a8f0e8

shadowXtntMap = (nil)

stripeXtntMap = (nil)

copyXtntMap = (nil)

migTruncLk = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005aefe88

prev = 0xfffffc0005aefe88

}

l_caller = 0

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = (nil)

}

type = BSXMT_APPEND

allocPageCnt = 0

}

dirTruncp = (nil)

putpage_lk = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005aefec8

prev = 0xfffffc0005aefec8

}

l_caller = 3347236

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc0001c92f00

}

real_bfap = (nil)

idx_params = (nil)

idxQuotaBlks = 0

largest_pl_num = 0

actRangeLock = struct {

mutex = 0

}

actRangeList = struct {

arFwd = 0xfffffc0005aeff10

arBwd = 0xfffffc0005aeff10

arCount = 0

arMaxLen = 0

arDioCount = 0

arDioWaiters = 0

}

}

(dbx)

12. Examine the back pointers to the vnode and VM object. Use these fields for additional confirmation that you have the right target. You will also see pointers to extent map information and the bitfile-set and domain structures.

(dbx) p *(struct bfAccess *)$bfa
struct {
.....
    bfah = 452
    vp = 0xfffffc00018e8800
    obj = 0xfffffc00023d3860
.......
    bfSetp = 0xfffffc0001240e08
.......
    domainp = 0xfffffc000123e008
.......
    xtnts = struct {
        validFlag = 1
        xtntMap = 0xfffffc0003b328a8
        shadowXtntMap = (nil)
        stripeXtntMap = (nil)

13. Print the extent map.

For most bitfiles, this is not a very exciting structure; however, for bitfiles with many extents, this is the beginning of a mass of pointers. Note that the subXtntMap field is an array with validCnt elements. Each subXtntMap structure has an array of extents (bsXA) with cnt elements.

(dbx) p *(*(struct bfAccess *)$bfa).xtnts.xtntMap

struct {

nextXtntMap = (nil)

domain = 0xfffffc0000f24008

hdrType = 1

hdrVdIndex = 1

hdrMcellId = struct {

cell = 1

page = 985

}

blksPerPage = 16

nextValidPage = 0

allocDeallocPageCnt = 0

allocVdIndex = 65535

origStart = 0

origEnd = 0

updateStart = 1

updateEnd = 0

validCnt = 1

cnt = 1

maxCnt = 1

subXtntMap = 0xfffffc0001346f88

}
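Step 13 describes two levels of arrays: subXtntMap holds validCnt entries, and each entry holds a bsXA array of cnt extent descriptors. The sketch below walks both levels. The field names validCnt, subXtntMap, bsXA, and cnt come from the text; the *_sketch types and the page and blk members of a descriptor are hypothetical placeholders, since the real descriptor layout is not shown in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extent descriptor; member names are placeholders. */
struct xtnt_sketch {
    unsigned long page;   /* assumed: first bitfile page of the extent */
    unsigned long blk;    /* assumed: starting disk block */
};

struct subXtntMap_sketch {
    unsigned int cnt;             /* number of valid entries in bsXA */
    struct xtnt_sketch *bsXA;     /* array of extent descriptors */
};

struct xtntMap_sketch {
    unsigned int validCnt;                  /* valid entries in subXtntMap */
    struct subXtntMap_sketch *subXtntMap;   /* array of sub extent maps */
};

/* Visit every descriptor reachable from one extent map. Returns the
 * highest starting page seen and, if visited is not NULL, stores the
 * number of descriptors walked. */
unsigned long highest_start_page(const struct xtntMap_sketch *map,
                                 unsigned long *visited)
{
    unsigned long highest = 0, seen = 0;
    unsigned int i, j;

    assert(map != NULL);
    for (i = 0; i < map->validCnt; i++) {
        const struct subXtntMap_sketch *sub = &map->subXtntMap[i];
        for (j = 0; j < sub->cnt; j++) {
            if (sub->bsXA[j].page > highest)
                highest = sub->bsXA[j].page;
            seen++;
        }
    }
    if (visited != NULL)
        *visited = seen;
    return highest;
}
```

For the file examined here validCnt is 1, so the outer loop runs once; a heavily fragmented file is where the nesting starts to matter.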

Now proceed to the bitfile-set, domain, and virtual disk data structures of AdvFS. Use the pointers of the bitfile access structure to find them.

14. Move from the bitfile access structure into the bitfile-set structure. Print it. There is a lot to see. Be sure to use the bitfile-set ID field to verify and match the values returned by showfsets.

(dbx) set $bfs=(bfSetT *)(((struct bfAccess *)$bfa)->bfSetp)
(dbx) p $bfs
(dbx) p *(bfSetT *)$bfs
struct {
    bfSetId = struct {
        domainId = struct {
            tv_sec = 864927707
            tv_usec = 860832
        }
        dirTag = struct {
            num = 1
            seq = 32769
        }
    }
.......
    dmnp = 0xfffffc000123e008

(dbx) p (*(struct bfAccess *)$bfa).bfSetp

0xfffffc0005b7ca08

(dbx)

(dbx)

(dbx) set $bfs = 0xfffffc0005b7ca08

(dbx)

(dbx)

(dbx)

(dbx) whatis bfSetT

typedef struct bfSet {

dyn_hashlinks_w_keyT hashlinks;

bfSetName[32] of char ;

bfSetIdT bfSetId;

uint_t bfSetMagic;

int refCnt;

int logicalRefCnt;

domainT * dmnP;

bfsQueueT bfSetList;

mutexT accessChainLock;

bfAccessT * accessFwd;

bfAccessT * accessBwd;

dev_t dev;

bfTagT dirTag;

bfAccessT * dirBfAp;

bfSetT * cloneSetp;

bfSetT * origSetp;

uint32T cloneId;

uint32T cloneCnt;

uint32T numClones;

uint32T outOfSync;

mutexT cloneDelStateMutex;

stateLkT cloneDelState;

int xferThreads;

uint32T infoLoaded;

uint32T cachepolicy;

mutexT dirMutex;

ftxLkT dirLock;

bfsStateT state;

int bfCnt;

unsigned long tagFrLst;

unsigned long tagUnInPg;

unsigned long tagUnMpPg;

ftxLkT fragLock;

bfTagT fragBfTag;

bfAccessT * fragBfAp;

uint32T freeFragGrps;

uint32T truncating;

fragGrps[8] of fragGrpT ;

void * fsnp;

} bfSetT;

(dbx)

(dbx)

(dbx) p *(bfSetT *)$bfs

struct {

hashlinks = struct {

dh_links = struct {

dh_next = 0xfffffc0005b7ca08

dh_prev = 0xfffffc0005b7ca08

}

dh_key = 937059923

}

bfSetName = "usr"

bfSetId = struct {

domainId = struct {

tv_sec = 937059922

tv_usec = 653520

}

dirTag = struct {

num = 1

seq = 32769

}

}

bfSetMagic = 2918187010

refCnt = 1

logicalRefCnt = 1

dmnP = 0xfffffc0000f24008

bfSetList = struct {

bfsQfwd = 0xfffffc0005b7c7e8

bfsQbck = 0xfffffc0005b7cce8

}

accessChainLock = struct {

mutex = 0

}

accessFwd = 0xfffffc0005ae4d88

accessBwd = 0xfffffc0005af7b08

dev = -518913147

dirTag = struct {

num = 1

seq = 32769

}

dirBfAp = 0xfffffc0005af8008

cloneSetp = (nil)

origSetp = (nil)

cloneId = 0

cloneCnt = 0

numClones = 0

outOfSync = 0

cloneDelStateMutex = struct {

mutex = 0

}

cloneDelState = struct {

hdr = struct {

lkType = LKT_STATE

nxtFtxLk = (nil)

mutex = 0xfffffc0005b7cac8

lkUsage = LKU_CLONE_DEL

}

state = CLONE_DEL_NORMAL

pendingState = LKW_NONE

waiters = 0

cv = 0

}

xferThreads = 0

infoLoaded = 1

cachepolicy = 0

dirMutex = struct {

mutex = 0

}

dirLock = struct {

hdr = struct {

lkType = LKT_FTX

nxtFtxLk = (nil)

mutex = 0xfffffc0005b7cb10

lkUsage = LKU_UNKNOWN

}

lock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005b7cb40

prev = 0xfffffc0005b7cb40

}

l_caller = 3630684

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc0001c92f00

}

cv = 0

}

state = BFS_READY

bfCnt = 23723

tagFrLst = 24

tagUnInPg = 24

tagUnMpPg = 24

fragLock = struct {

hdr = struct {

lkType = LKT_FTX

nxtFtxLk = (nil)

mutex = 0xfffffc0005b7cb10

lkUsage = LKU_UNKNOWN

}

lock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0005b7cbb8

prev = 0xfffffc0005b7cbb8

}

l_caller = 3630684

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc0001c92600

}

cv = 0

}

fragBfTag = struct {

num = 1

seq = 32769

}

fragBfAp = 0xfffffc0005af7b08

freeFragGrps = 1

truncating = 0

fragGrps = {

[0] struct {

firstFreeGrp = 7280

lastFreeGrp = 32

}

[1] struct {

firstFreeGrp = 7216

lastFreeGrp = 5024

}

[2] struct {

firstFreeGrp = 7168

lastFreeGrp = 7168

}

[3] struct {

firstFreeGrp = 7248

lastFreeGrp = 7248

}

[4] struct {

firstFreeGrp = 7232

lastFreeGrp = 7232

}

[5] struct {

firstFreeGrp = 7120

lastFreeGrp = 7120

}

[6] struct {

firstFreeGrp = 7056

lastFreeGrp = 7056

}

[7] struct {

firstFreeGrp = 7264

lastFreeGrp = 7264

}

}

fsnp = 0xfffffc0005ab5088

}

(dbx)

(dbx)

(dbx) sh showfsets usr_domain

usr

Id : 37da6652.0009f8d0.1.8001

Files : 23704, SLim= 0, HLim= 0

Blocks (512) : 1026436, SLim= 0, HLim= 0

Quota Status : user=off group=off

var

Id : 37da6652.0009f8d0.2.8001

Files : 970, SLim= 0, HLim= 0

Blocks (512) : 165004, SLim= 0, HLim= 0

Quota Status : user=off group=off
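The Id line that showfsets prints for the usr fileset, 37da6652.0009f8d0.1.8001, lines up with the bfSetId printed by dbx: the first two words are the domainId (tv_sec, tv_usec) and the last two are dirTag.num and dirTag.seq (32769 is 0x8001). A sketch of parsing that string back into numbers for comparison is shown below. The struct and function names are hypothetical, and treating all four fields as hexadecimal is an assumption based on this one sample.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flat copy of the four numbers in a bitfile-set ID. */
struct bfSetId_sketch {
    unsigned int tv_sec, tv_usec;   /* domainId */
    unsigned int num, seq;          /* dirTag */
};

/* Parse "sec.usec.num.seq" (assumed hexadecimal) as printed by
 * showfsets. Returns 1 on success, 0 if four fields were not found. */
int parse_fileset_id(const char *text, struct bfSetId_sketch *out)
{
    assert(text != NULL && out != NULL);
    return sscanf(text, "%x.%x.%x.%x",
                  &out->tv_sec, &out->tv_usec,
                  &out->num, &out->seq) == 4;
}
```

Parsing the usr line yields tv_sec = 937059922, tv_usec = 653520, num = 1, and seq = 32769, the same values shown in the bfSetId substructure of the bitfile-set structure.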

15. Now print the domain structure. (Do not use struct domain unless you want the structure for socket domains.) In the middle of this structure, you will see an array of pointers to virtual disk structures. There are also many fields used to control file domain I/O.

(dbx) set $d=(domainT *)(((bfSetT *)$bfs)->dmnp)
(dbx) p $d
0xfffffc000123e008
(dbx) p *(domainT *)$d
struct {
.......
    domainName = "usr_domain"
.......
    vdpTbl = {
        [0] 0xfffffc0003a18388
.......

(dbx) p (*(bfSetT *)$bfs).dmnp

0xfffffc0000f24008

(dbx)

(dbx)

(dbx) set $d = 0xfffffc0000f24008

(dbx)

(dbx)

(dbx) p *(domainT *)$d

struct {

mutex = struct {

mutex = 0

}

dmnMagic = 2918187011

dmnFwd = 0xfffffc0000f24008

dmnBwd = 0xfffffc0000f24008

dmnHashlinks = struct {

dh_links = struct {

dh_next = 0xfffffc0000f24008

dh_prev = 0xfffffc0000f24008

}

dh_key = 937059922

}

dmnVersion = 4

state = BFD_ACTIVATED

domainId = struct {

tv_sec = 937059922

tv_usec = 653520

}

dualMountId = struct {

tv_sec = 0

tv_usec = 0

}

bfDmnMntId = struct {

tv_sec = 938694756

tv_usec = 891025

}

dmnAccCnt = 4

dmnRefWaiters = 0

activateCnt = 2

mountCnt = 2

bfSetDirp = 0xfffffc0005b7c788

bfSetDirTag = struct {

num = 4294967288

seq = 0

}

BfSetTblLock = struct {

hdr = struct {

lkType = LKT_FTX

nxtFtxLk = (nil)

mutex = 0xfffffc0000f24008

lkUsage = LKU_UNKNOWN

}

lock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0000f240a8

prev = 0xfffffc0000f240a8

}

l_caller = 3271972

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc000320bb00

}

cv = 0

}

bfSetHead = struct {

bfsQfwd = 0xfffffc0005b7cce8

bfsQbck = 0xfffffc0005b7c7e8

}

bfSetDirAccp = 0xfffffc0005af8488

ftxLogTag = struct {

num = 4294967287

seq = 0

}

ftxLogP = 0xfffffc0005baec48

ftxLogPgs = 512

logAccessp = 0xfffffc0005af8908

ftxTbld = struct {

rrNextSlot = 13

rrSlots = 30

ftxWaiters = 0

trimWaiters = 0

excWaiters = 0

slotCv = 0

trimCv = 0

excCv = 0

logTrimLsn = struct {

num = 0

}

nextNewSlot = 30

oldestFtxLa = struct {

read = 782

update = 783

lgra = {

[0] struct {

page = 184

offset = 1790

lsn = struct {

num = 2821672

}

}

[1] struct {

page = 185

offset = 656

lsn = struct {

num = 0

}

}

}

}

lastFtxId = 68203

slotUseCnt = 0

noTrimCnt = 0

tablep = 0xfffffc0001360808

oldestSlot = 13

totRoots = 68203

}

pinBlockBuf = (nil)

domainName = "usr_domain"

majorNum = 2055

flag = BFD_NORMAL

lsnLock = struct {

mutex = 0

}

lsnList = struct {

lsnFwd = 0xfffffe0407605e70

lsnBwd = 0xfffffe0407605e70

accFwd = (nil)

accBwd = (nil)

freeFwd = (nil)

freeBwd = (nil)

hashFwd = (nil)

hashBwd = (nil)

length = 1

touched = 0

ioOut = 0

lenLimit = 0

indexBuf = (nil)

}

writeToLsn = struct {

num = 0

}

pinBlockWait = 0

pinBlockCv = 0

pinBlockRunning = 0

contBits = 0

dirtyBufLa = struct {

read = 266

update = 266

lgra = {

[0] struct {

page = 185

offset = 656

lsn = struct {

num = 2821708

}

}

[1] struct {

page = 185

offset = 603

lsn = struct {

num = 2821706

}

}

}

}

scLock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0000f24308

prev = 0xfffffc0000f24308

}

l_caller = 3557160

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc0001c93500

}

scTbl = 0xfffffc00035fe008

vdpTblLock = struct {

mutex = 0

}

vdCnt = 1

vdpTbl = {

[0] 0xfffffc0000f2b508

[1] (nil)

[2] (nil)

(…)

[255] (nil)

}

rmvolTruncLk = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0000f24b50

prev = 0xfffffc0000f24b50

}

l_caller = 3777480

l_wait_writers = 0

l_readers = 0

l_flags = '^@'

l_lifms = '\200'

l_info = 0

l_lastlocker = 0xfffffc00022f4300

}

bcStat = struct {

pinHit = 13744

pinHitWait = 207

pinRead = 0

refHit = 319958

refHitWait = 128

raBuf = 3376

ubcHit = 2107

unpinCnt = struct {

lazy = 13391

blocking = 56

clean = 10

log = 908

}

derefCnt = 330259

devRead = 9034

devWrite = 1790

unconsolidate = 0

consolAbort = 0

unpinFileType = struct {

meta = 10673

ftx = 12051

}

derefFileType = struct {

meta = 137070

ftx = 318466

}

}

bmtStat = struct {

fStatRead = 0

fStatWrite = 67009

resv1 = 0

resv2 = 0

bmtRecRead = {

[0] 0

[1] 0

[2] 0

[3] 0

[4] 0

[5] 0

[6] 0

[7] 0

[8] 0

[9] 0

[10] 0

[11] 0

[12] 0

[13] 0

[14] 0

[15] 0

[16] 0

[17] 0

[18] 0

[19] 0

[20] 0

[21] 0

}

bmtRecWrite = {

[0] 0

[1] 0

[2] 81

[3] 0

[4] 0

[5] 0

[6] 0

[7] 0

[8] 33

[9] 0

[10] 0

[11] 0

[12] 0

[13] 0

[14] 0

[15] 1

[16] 84

[17] 2

[18] 23

[19] 0

[20] 0

[21] 0

}

}

logStat = struct {

logWrites = 382

transactions = 12731

segmentedRecs = 2

logTrims = 0

wastedWords = 35019

maxLogPgs = 59

minLogPgs = 0

maxFtxWords = 317

maxFtxAgent = 65

maxFtxTblSlots = 29

oldFtxTblAgent = 0

excSlotWaits = 0

fullSlotWaits = 2

rsv1 = 0

rsv2 = 0

rsv3 = 0

rsv4 = 0

}

totalBlks = 1426112

freeBlks = 204928

dmn_panic = 0

xidRecovery = struct {

head = (nil)

tail = (nil)

current_free_slot = 0

timestamp = struct {

tv_sec = 0

tv_usec = 0

}

}

xidRecoveryLk = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0000f24e08

prev = 0xfffffc0000f24e08

}

l_caller = 0

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = (nil)

}

smsync_policy = 0

metaPagep = 0xfffffe0400299008

fs_full_time = 0

}

(dbx)

16. The last major structure to print is used for virtual disks. You will see even more I/O control substructures here.

(dbx) set $vd=(struct vd *)(((domainT *)$d)->vdpTbl[0])
(dbx) p $vd
0xfffffc0003a18388
(dbx) p *(struct vd *)$vd
struct {
.......
    vdName = "/dev/disk/dsk2g"
........
    freeStgLst = 0xfffffc0003fcf188

(dbx) p (*(domainT *)$d).vdpTbl[0]

0xfffffc0000f2b508

(dbx)

(dbx)

(dbx) set $vd = 0xfffffc0000f2b508

(dbx)

(dbx)

(dbx) p *(struct vd *)$vd

struct {

stgCluster = 16

devVp = 0xfffffc0001a8c6c0

vdMagic = 2918187012

rbmtp = 0xfffffc0005af9688

bmtp = 0xfffffc0005af9208

sbmp = 0xfffffc0005af8d88

dmnP = 0xfffffc0000f24008

vdIndex = 1

maxPgSz = 16

bmtXtntPgs = 128

vdName = "/dev/disk/dsk1g"

vdState = BSR_VD_MOUNTED

vdSetupThd = (nil)

vdRefCnt = 0

vdRefWaiters = 0

vdStateLock = struct {

mutex = 0

}

vdSize = 1426112

vdSectorSize = 512

vdClusters = 89132

serviceClass = 1

mcell_lk = struct {

hdr = struct {

lkType = LKT_FTX

nxtFtxLk = (nil)

mutex = 0xfffffc0000f24008

lkUsage = LKU_UNKNOWN

}

lock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0000f2b9a0

prev = 0xfffffc0000f2b9a0

}

l_caller = 3630684

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc00022f4600

}

cv = 0

}

nextMcellPg = 985

rbmt_mcell_lk = struct {

hdr = struct {

lkType = LKT_FTX

nxtFtxLk = (nil)

mutex = 0xfffffc0000f24008

lkUsage = LKU_UNKNOWN

}

lock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0000f2ba00

prev = 0xfffffc0000f2ba00

}

l_caller = 0

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = (nil)

}

cv = 0

}

lastRbmtPg = 0

rbmtFlags = 0

stgMap_lk = struct {

hdr = struct {

lkType = LKT_FTX

nxtFtxLk = (nil)

mutex = 0xfffffc0000f24008

lkUsage = LKU_UNKNOWN

}

lock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0000f2ba60

prev = 0xfffffc0000f2ba60

}

l_caller = 3575624

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc0001c93500

}

cv = 0

}

freeStgLst = 0xfffffc0001a8e928

numFreeDesc = 50

freeClust = 12807

scanStartClust = 34044

bitMapPgs = 2

spaceReturned = 1

fill1 = (nil)

fill3 = (nil)

fill4 = 0

del_list_lk = struct {

hdr = struct {

lkType = LKT_FTX

nxtFtxLk = (nil)

mutex = 0xfffffc0000f24008

lkUsage = LKU_UNKNOWN

}

lock = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0000f2baf0

prev = 0xfffffc0000f2baf0

}

l_caller = 3630684

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc00022f4600

}

cv = 0

}

ddlActiveLk = struct {

l_lock = 0

l_head = struct {

next = 0xfffffc0000f2bb28

prev = 0xfffffc0000f2bb28

}

l_caller = 3365256

l_wait_writers = 0

l_readers = 0

l_flags = ’^@’

l_lifms = ’\200’

l_info = 0

l_lastlocker = 0xfffffc00022f4600

}

ddlActiveWaitMCId = struct {

cell = 0

page = 0

}

ddlActiveWaitCv = 0

dStat = struct {

nread = 9034

nwrite = 1806

readblk = 194256

writeblk = 43648

seekCnt = 0

rglobBuf = 3543

rglobBlk = 56688

rglob = 436

wglobBuf = 1292

wglobBlk = 20672

wglob = 370

blockingQ = 0

waitLazyQ = 3

readyLazyQ = 0

consolQ = 0

devQ = 0

}

vdIoLock = struct {

mutex = 0

}

blockingQ = struct {

fwd = 0xfffffc0000f2bbe0

bwd = 0xfffffc0000f2bbe0

length = 0

lenLimit = 0

}

waitLazyQ = struct {

fwd = 0xfffffe0407605fa0

bwd = 0xfffffe0407605fa0

length = 1

lenLimit = 0

}

smSyncQ = {

[0] struct {

fwd = 0xfffffc0000f2bc10

bwd = 0xfffffc0000f2bc10

length = 0

lenLimit = 0

}

[1] struct {

fwd = 0xfffffc0000f2bc28

bwd = 0xfffffc0000f2bc28

length = 0

lenLimit = 0

}

(…)

[15] struct {

fwd = 0xfffffc0000f2bd78

bwd = 0xfffffc0000f2bd78

length = 0

lenLimit = 0

}

}

readyLazyQ = struct {

fwd = 0xfffffc0000f2bd90

bwd = 0xfffffc0000f2bd90

length = 0

lenLimit = 1024

}

consolQ = struct {

fwd = 0xfffffc0000f2bda8

bwd = 0xfffffc0000f2bda8

length = 0

lenLimit = 0

}

devQ = struct {

fwd = 0xfffffc0000f2bdc0

bwd = 0xfffffc0000f2bdc0

length = 0

lenLimit = 0

}

blockingCnt = 0

blockingFact = 4

rdmaxio = 256

wrmaxio = 256

vdIoOut = 0

start_active = 0

gen_active = 0

active = struct {

hdr = struct {

lkType = LKT_STATE

nxtFtxLk = (nil)

mutex = 0xfffffc0000f2bbd8

lkUsage = LKU_VD_ACTIVE

}

state = INACTIVE_DISK

pendingState = LKW_NONE

waiters = 0

cv = 0

}

advfs_start_more_posted = 0

blkQ_cnt = 12899

lazyQ_cnt = 13413

smsyncQ_cnt = 3706

readyQ_cnt = 2027

consolQ_cnt = 1976

devQ_cnt = 1970

rmioq_cnt = 11441

rmormvq_cnt = 1

syncQIndx = 12

consolidate = 1

max_iosize_rd = 1048576

max_iosize_wr = 1048576

preferred_iosize_rd = 131072

preferred_iosize_wr = 131072

qtodev = 3

freeRsvdStg = struct {

start_clust = 1545

num_clust = 4370

prevp = 0xfffffc0000f2be90

nextp = 0xfffffc0000f2be90

}

}
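Several numbers in the virtual disk structure are related by the cluster size. With stgCluster = 16, a vdSize of 1426112 blocks gives 1426112 / 16 = 89132, which is the vdClusters value, and freeClust = 12807 corresponds to 12807 * 16 = 204912 free blocks. That is close to, but not the same as, the free counts reported by showfdmn (204960) and in the domain structure (204928), presumably because the values were sampled at slightly different moments. The helpers below are a hypothetical sketch of the conversion, assuming stgCluster is expressed in 512-byte blocks per cluster.

```c
#include <assert.h>

/* Convert a size in 512-byte blocks to whole storage clusters.
 * stgCluster is assumed to mean blocks per cluster. */
unsigned long blocks_to_clusters(unsigned long blocks,
                                 unsigned int stgCluster)
{
    assert(stgCluster != 0);
    return blocks / stgCluster;
}

/* Convert a cluster count back to 512-byte blocks. */
unsigned long clusters_to_blocks(unsigned long clusters,
                                 unsigned int stgCluster)
{
    assert(stgCluster != 0);
    return clusters * stgCluster;
}
```

Doing this arithmetic by hand is a quick way to confirm that the vd structure you printed belongs to the volume that showfdmn lists.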

17. For the finale, print the data structure of the free storage cache.

(dbx) set $fst=(stgDescT *)(((struct vd *)$vd)->freeStgLst)
(dbx) p $fst
0xfffffc0003fcf188
(dbx) p *(stgDescT *)$fst
struct {
    start_clust = 705832
    num_clust = 14360
    prevp = 0xfffffc0003fcf188
    nextp = 0xfffffc0003fcf188
}

(dbx) p (*(struct vd *)$vd).freeStgLst

0xfffffc0001a8e928

(dbx)

(dbx)

(dbx) set $fst = 0xfffffc0001a8e928

(dbx)

(dbx)

(dbx) p *(stgDescT *)$fst

struct {

start_clust = 6137

num_clust = 3745

prevp = 0xfffffc0001a8ef48

nextp = 0xfffffc0001a8e948

}
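Each free storage descriptor records a run of free clusters (start_clust, num_clust) and links to its neighbors through prevp and nextp. The sketch below adds up num_clust across such a list. It assumes the list is circular, which is only a guess based on the sample output in step 17 where a lone descriptor points to itself; the *_sketch type and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for stgDescT with the four fields shown by dbx. */
struct stgDesc_sketch {
    unsigned int start_clust;
    unsigned int num_clust;
    struct stgDesc_sketch *prevp;
    struct stgDesc_sketch *nextp;
};

/* Sum num_clust over the list, starting at head and stopping when the
 * walk returns to head. Treating the list as circular is an
 * assumption; a NULL link also ends the walk. */
unsigned long cached_free_clusters(const struct stgDesc_sketch *head)
{
    const struct stgDesc_sketch *d = head;
    unsigned long total = 0;

    assert(head != NULL);
    do {
        total += d->num_clust;
        d = d->nextp;
    } while (d != NULL && d != head);
    return total;
}
```

In dbx the same walk is done by repeatedly printing the structure at nextp until the starting address comes around again.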

4

AdvFS System Calls and Kernel Interfaces

About This Chapter

Introduction

This chapter presents information on entries into AdvFS. Specifically:

• Virtual file system (VFS) switch table

• The vnode switch table

• Unified Buffer Cache (UBC) interface

• Device driver callback

• Lightweight context (LWC) interface

• I/O completion function

• True AdvFS system call

• Types of AdvFS system calls

• Domains and volumes

• Filesets

• Miscellaneous operations

• Algorithms for startup and recovery

• Storage management algorithms

• Cloning algorithms

• File migration and deletion algorithms

• Algorithms for threads

Objectives

To describe the entries into AdvFS, you should be able to:

• List the various entry points to AdvFS.

• Determine how an AdvFS system call is processed.

• Describe the algorithms for startup and recovery.

• Explain the storage management algorithms.

• Define the cloning algorithms.

• Describe the file migration and deletion algorithms.

• Describe the algorithms for threads.

Resources

For more information on topics in this chapter, see the following sources:

• /usr/include/sys/mount.h

• /usr/include/sys/vnode.h

• msfs/osf/msfs_vfsops.c

• msfs/osf/msfs_vnops.c

• msfs/osf/msfs_misc.c

• msfs/osf/msfs_io.c

• msfs/bs/bs_qio.c

• msfs/bs/bs_misc.c

• kernel/msfs/msfs/msfs_syscalls.h


Describing Entries to AdvFS

Overview

This section describes the interaction between AdvFS and the kernel. The material provides some familiarity with routines that appear in crash dumps and during live debugging.

• VFS switch table

• File-related system calls (VFS and vnode switch table)

• Unified buffer cache interface

• Device driver interface routines

• AdvFS system calls

VFS Switch Table

The VFS switch table defines a series of pointers to functions implementing many of the file system level activities. The VFS switch table has:

• Thirteen entry points for file system operations (includes V5 smooth synchronization)

• An interface defined in:

— /usr/include/sys/mount.h

— struct vfsops * m_op;

• The interface is implemented in:

— msfs/osf/msfs_vfsops.c

The following excerpt from msfs_vfsops.c shows the 13 VFS switch table routine names for AdvFS (MSFS) activities.

Example 4-1: VFS Switch Table Routine List

/*
 * msfs_vfsops
 *
 * Defines function pointers to AdvFS specific VFS fs operations.
 */
struct vfsops msfs_vfsops = {
    msfs_mount,
    msfs_start,
    msfs_unmount,
    msfs_root,
    advfs_quotactl,
    msfs_statfs,
    msfs_sync,
    msfs_fhtovp,
    msfs_vptofh,
    msfs_init,
    msfs_mountroot,
    msfs_noop,
    msfs_smoothsync,
};
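A switch table of this kind lets the generic file system layer call a file-system-specific routine through a pointer without knowing which file system is mounted. The following is a minimal sketch of that dispatch pattern, assuming a reduced two-entry table with simplified signatures. Every name prefixed with demo_ is hypothetical; the real struct vfsops has thirteen entries with kernel-specific argument lists.

```c
#include <stddef.h>

/* Hypothetical, reduced switch table for illustration only. */
struct demo_vfsops {
    int (*vfs_mount)(const char *path);
    int (*vfs_sync)(void);
};

struct demo_mount {
    const struct demo_vfsops *m_op;   /* plays the role of m_op in struct mount */
};

/* File-system-specific routines (stand-ins for msfs_mount, msfs_sync). */
static int demo_msfs_mount(const char *path) { return path != NULL ? 0 : -1; }
static int demo_msfs_sync(void) { return 0; }

const struct demo_vfsops demo_msfs_vfsops = {
    demo_msfs_mount,
    demo_msfs_sync,
};

/* Generic layer: calls whichever file system the mount structure points at. */
int demo_vfs_mount(const struct demo_mount *mp, const char *path)
{
    return mp->m_op->vfs_mount(path);
}
```

Because the generic layer sees only the pointers, a routine name such as msfs_mount in a stack trace indicates that the call arrived through this table.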

vnode Switch Table

The vnode switch table defines a series of pointers to functions implementing most of the file-oriented AdvFS activities. The vnode switch table has:

• Forty-two entry points for file operations.

• An interface defined in:

— /usr/include/sys/vnode.h

— struct vnodeops * v_op;

• The interface is implemented in msfs/osf/msfs_vnops.c

The example lists the 42 entry points for vnode operations involving AdvFS.

Example 4-2: File (vnode) Operations

/*
 * msfs_vnodeops
 *
 * Defines function pointers to AdvFS specific VFS vnode operations.
 */
struct vnodeops msfs_vnodeops = {
    msfs_lookup,        /* lookup */
    msfs_create,        /* create */
    msfs_mknod,         /* mknod */
    msfs_open,          /* open */
    msfs_close,         /* close */
    msfs_access,        /* access */
    msfs_getattr,       /* getattr */
    msfs_setattr,       /* setattr */
    msfs_read,          /* read */
    msfs_write,         /* write */
    msfs_ioctl,         /* ioctl */
    seltrue,            /* select */
    msfs_mmap,          /* mmap */
    msfs_fsync,         /* fsync */
    msfs_seek,          /* seek */
    msfs_remove,        /* remove */
    msfs_link,          /* link */
    msfs_rename,        /* rename */
    msfs_mkdir,         /* mkdir */
    msfs_rmdir,         /* rmdir */
    msfs_symlink,       /* symlink */
    msfs_readdir,       /* readdir */
    msfs_readlink,      /* readlink */
    msfs_abortop,       /* abortop */
    msfs_inactive,      /* inactive */
    msfs_reclaim,       /* reclaim */
    msfs_bmap,          /* bmap */
    msfs_strategy,      /* strategy */
    msfs_print,         /* print */
    msfs_page_read,     /* page_read */
    msfs_page_write,    /* page_write */
    msfs_swap,          /* swap handler */
    msfs_bread,         /* buffer read */
    msfs_brelse,        /* buffer release */
    msfs_lockctl,       /* file locking */
    msfs_syncdata,      /* fsync byte range */
    msfs_noop,          /* Lock a node */
    msfs_noop,          /* Unlock a node */
    msfs_getproplist,   /* Get extended attributes */
    msfs_setproplist,   /* Set extended attributes */
    msfs_delproplist,   /* Delete extended attributes */
    msfs_pathconf,      /* pathconf */
};

UBC Interface

Ultimately, data read from an AdvFS file system ends up in a memory location that is associated with the unified buffer cache (UBC). The UBC interface consists of:

• vnode operations used in paging.

• msfs_getpage to obtain a page from disk.

• msfs_putpage to write a page to the disk.

The implementation is in msfs/osf/msfs_misc.c.

Device Driver Interface Routines

Eventually an AdvFS I/O must cause device activity to begin. The device activity is triggered by device driver routines. The device driver interface consists of:

• In AdvFS struct buf:

— b_iodone field contains address of msfs_iodone()

— Represents a buffer of data

— Listhead is bsBufList

• At interrupt: device driver calls msfs_iodone()


• msfs_iodone()

— Temporarily raises system priority level.

— Places buffer on MsfsIodoneBuf queue (holds completed I/O operations for AdvFS) found within the processor structure.

— Posts LWC_PRI_MSFS_UBC.

The implementation is in msfs/osf/msfs_io.c.

AdvFS Lightweight Context Interface

After the high system priority level processing, the remaining buffer work is done through the lightweight context (LWC) interface. The LWC interface consists of:

• Priority: LWC_PRI_MSFS_UBC

• Entry: msfs_async_iodone_lwc()

• msfs_async_iodone_lwc()

— Removes buffer from MsfsIodoneBuf

— Calls bs_osf_complete()

The implementation is in msfs/osf/msfs_io.c.
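The two halves described above follow a common pattern: the interrupt-time callback does as little as possible and queues the buffer, and a deferred routine later drains the queue and finishes the work. The following is a minimal single-threaded sketch of that split, assuming a simple linked queue and no locking or priority-level handling. The demo_ names are hypothetical; only the b_iodone field name and the MsfsIodoneBuf queue name come from the text.

```c
#include <stddef.h>

struct demo_buf {
    struct demo_buf *next;
    void (*b_iodone)(struct demo_buf *);  /* completion callback, as in struct buf */
    int completed;
};

static struct demo_buf *iodone_head;   /* stands in for the MsfsIodoneBuf queue */
static struct demo_buf *iodone_tail;

/* Interrupt-time half: do the minimum and queue the buffer. */
void demo_iodone(struct demo_buf *bp)
{
    bp->next = NULL;
    if (iodone_tail != NULL)
        iodone_tail->next = bp;
    else
        iodone_head = bp;
    iodone_tail = bp;
}

/* Deferred half: drain the queue and finish each buffer.
 * Returns the number of buffers completed. */
int demo_async_iodone(void)
{
    int n = 0;

    while (iodone_head != NULL) {
        struct demo_buf *bp = iodone_head;

        iodone_head = bp->next;
        if (iodone_head == NULL)
            iodone_tail = NULL;
        bp->completed = 1;   /* the real routine calls bs_osf_complete() here */
        n++;
    }
    return n;
}
```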

AdvFS I/O Completion Function

The I/O completion function:

• Checks for many errors

— If appropriate, prints error messages

— If an error occurs while writing to the log, panics the kernel

• Calls bs_io_complete() to reach the BAS layer

• Initiates more I/O if appropriate

Source location is msfs/bs/bs_qio.c.


True AdvFS System Call

The true AdvFS system calls consist of:

• msfs_real_syscall()

— Single call, many flavors

— Called through MsfsSyscallp (filled in when AdvFS is started) with the lower 32 bits of the KSEG address of msfs_real_syscall()

— MsfsSyscallp + 0xfffffc0000000000 = &msfs_real_syscall()

• First argument is operation type (used in large case statement to determine the action)

Source location: msfs/bs/bs_misc.c.
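The address arithmetic for MsfsSyscallp can be checked with a few lines of C. This is a minimal sketch, assuming the stored value is an unsigned 32-bit quantity that is added to the KSEG base quoted above. KSEG_BASE and kseg_address() are hypothetical names, and the sample value used in the note below is made up for illustration.

```c
#include <stdint.h>

#define KSEG_BASE 0xfffffc0000000000ULL   /* constant quoted in the text above */

/* Rebuild a full 64-bit kernel address from its stored lower 32 bits. */
uint64_t kseg_address(uint32_t low32)
{
    return KSEG_BASE + (uint64_t)low32;
}
```

For example, a stored value of 0x004a1b20 would correspond to the full address 0xfffffc00004a1b20.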

The following code shows the argument list for msfs_real_syscall().

Example 4-3: Prototype for msfs_real_syscall()

int
msfs_real_syscall(
    opTypeT opType,       /* in - msfs operation to be performed */
    libParamsT *parmBuf,  /* in - ptr to op-specific parameters buffer; */
                          /* contents are modified. */
    int parmBufLen        /* in - byte length of parmBuf */
    )
{

Types of AdvFS System Calls

There are 60 operation types of AdvFS system calls. The user interface routines (library wrappers for the system calls) are:

• Compiled into /usr/shlib/libmsfs.so.

• Included from msfs_syscalls.h.

The source is located at kernel/msfs/msfs/msfs_syscalls.h .

The following excerpt from msfs_syscalls.h lists the operation types that could be in the first argument to msfs_real_syscall().

Example 4-4: Operation Types within msfs_real_syscall()

typedef enum {
    OP_NONE,
    OP_GET_BF_PARAMS,
    OP_SET_BF_ATTRIBUTES,
    OP_GET_BF_XTNT_MAP,
    OP_ADD_STG,
    OP_ADD_OVER_STG,
    OP_MIGRATE,
    OP_DMN_INIT,
    OP_GET_DMNNAME_PARAMS,
    OP_GET_DMN_PARAMS,
    OP_SET_DMN_PARAMS,
    OP_GET_DMN_VOL_LIST,
    OP_GET_VOL_PARAMS,
    OP_SET_VOL_IOQ_PARAMS,
    OP_DUMP_LOCKS,
    OP_TRACE,
    OP_FSET_CREATE,
    OP_FSET_DELETE,
    OP_FSET_CLONE,
    OP_FSET_GET_INFO,
    OP_FSET_GET_ID,
    OP_GET_BFSET_PARAMS,
    OP_SET_BFSET_PARAMS,
    OP_ADD_VOLUME,
    OP_CRASH,
    OP_MSS_RESV1,
    (...)
    OP_MSS_RESV17,
    OP_UNDEL_ATTACH,
    OP_UNDEL_DETACH,
    OP_UNDEL_GET,
    OP_GET_NAME,
    OP_REM_STG,
    OP_EVENT,
    OP_TAG_STAT,
    OP_SWITCH_LOG,
    OP_GET_BF_IATTRIBUTES,
    OP_SET_BF_IATTRIBUTES,
    OP_MOVE_BF_METADATA,
    OP_GET_VOL_BF_DESCS,
    OP_REM_VOLUME,
    OP_ADD_REM_VOL_SVC_CLASS,
    OP_SWITCH_ROOT_TAGDIR,
    OP_SET_BF_NEXT_ALLOC_VOL,
    OP_DISK_ERROR,
    OP_FTX_PROF,
    OP_REWRITE_XTNT_MAP,
    OP_RESET_FREE_SPACE_CACHE,
    OP_SET_NEXT_TAG,
    OP_REM_NAME,
    OP_REM_BF,
    OP_FSET_RENAME,
    OP_GET_LOCK_STATS,
    OP_FSET_GET_STATS,
    OP_GET_BKUP_XTNT_MAP,
    OP_GET_VOL_PARAMS2,
    OP_GET_GLOBAL_STATS,
    OP_GET_SMSYNC_STATS,
    OP_GET_IDX_BF_PARAMS,
    OP_ADD_REM_VOL_DONE
} opIndexT;
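The "large case statement" keyed on the first argument can be pictured as a switch over the operation type. The following is a minimal sketch of that shape, assuming three operation codes and plain integer return values. The DEMO_ constants, demo_real_syscall(), and its return codes are hypothetical and do not reproduce the body of msfs_real_syscall().

```c
#include <stddef.h>

typedef enum { DEMO_OP_NONE, DEMO_OP_DMN_INIT, DEMO_OP_FSET_CREATE } demo_op;

/* Returns 0 on success, -1 for an unknown operation or a bad buffer. */
int demo_real_syscall(demo_op op, void *parmBuf, int parmBufLen)
{
    /* Every operation except the no-op needs a parameter buffer. */
    if (op != DEMO_OP_NONE && (parmBuf == NULL || parmBufLen <= 0))
        return -1;

    switch (op) {
    case DEMO_OP_NONE:
        return 0;   /* nothing to do */
    case DEMO_OP_DMN_INIT:
        /* a real handler would unpack parmBuf and create the domain */
        return 0;
    case DEMO_OP_FSET_CREATE:
        /* a real handler would unpack parmBuf and create the fileset */
        return 0;
    default:
        return -1;
    }
}
```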


Domains and Volumes

This section specifies some of the central routines associated with some common AdvFS domain and volume commands. Each routine is listed with the utility that uses it:

• msfs_dmn_init(), used by mkfdmn

The following example shows the argument list for msfs_dmn_init().

Example 4-5: Prototype for msfs_dmn_init()

mlStatusT
msfs_dmn_init(
    char* domain,               /* in - bf domain name */
    int maxVols,                /* in - maximum number of virtual disks */
    u32T logPgs,                /* in - number of pages in log */
    mlServiceClassT logSvc,     /* in - log service attributes */
    mlServiceClassT tagSvc,     /* in - tag directory service attributes */
    char *volName,              /* in - block special device name */
    mlServiceClassT volSvc,     /* in - service class */
    u32T volSize,               /* in - size of the virtual disk */
    u32T bmtXtntPgs,            /* in - number of pages per BMT extent */
    u32T bmtPreallocPgs,        /* in - number of pages to be preallocated for the BMT */
    u32T domainVersion,         /* in - on-disk version of domain */
    mlBfDomainIdT* bfDomainId   /* out - domain id */
    );

• msfs_add_volume(), used by addvol

The following example shows the argument list for msfs_add_volume().

Example 4-6: Prototype for msfs_add_volume()

mlStatusT
msfs_add_volume(
    char *domain,               /* in - domain name */
    char *volName,              /* in - block special device name */
    mlServiceClassT *volSvc,    /* in/out - service class */
    u32T volSize,               /* in - size of the virtual disk */
    u32T bmtXtntPgs,            /* in - number of pages per BMT extent */
    u32T bmtPreallocPgs,        /* in - number of pages to be preallocated for the BMT */
    mlBfDomainIdT *bfDomainId,  /* out - the domain id */
    u32T *volIndex              /* out - vol index */
    );

• advfs_remove_volume(), used by rmvol

The following example shows the argument list for advfs_remove_volume().


Example 4-7: Prototype for advfs_remove_volume()

mlStatusT
advfs_remove_volume(
    mlBfDomainIdT bfDomainId,   /* in */
    u32T volIndex,              /* in */
    u32T forceFlag              /* in */
    );

• msfs_syscall_op_get_dmn_params(), used by showfdmn

This example shows the msfs_syscall_op_get_dmn_params() argument list.

Example 4-8: Prototype for msfs_syscall_op_get_dmn_params()

mlStatusT
msfs_syscall_op_get_dmn_params(
    libParamsT *libBufp
    );

• msfs_syscall_op_get_dmn_vol_list()

Filesets

This section specifies some of the routines associated with some common fileset-oriented commands. Each routine is listed with the utility that uses it:

• msfs_fset_create(), used by mkfset

The following example shows the argument list for the msfs_fset_create() routine found in msfs_syscalls.h.

Example 4-9: Prototype for msfs_fset_create()

mlStatusT
msfs_fset_create(
    char *domain,               /* in - domain name */
    char *setName,              /* in - set’s name */
    mlServiceClassT reqServ,    /* in - required service class */
    mlServiceClassT optServ,    /* in - optional service class */
    u32T userId,                /* in - user id */
    gid_t quotaId,              /* in - group ID for quota files */
    mlBfSetIdT *bfSetId         /* out - bitfile set id */
    );


The other routines mentioned in this section are also prototyped in msfs_syscalls.h.

• msfs_fset_clone(), used by clonefset

• msfs_fset_delete(), used by rmfset

• msfs_set_bfset_params(), used by chfsets

And many more.

Miscellaneous Operations

Here are a few more miscellaneous operations prototyped in msfs_syscalls.h:

• advfs_migrate()

Moves blocks of an open file

• msfs_syscall_op_set_bf_attributes()

Stripes a file

• msfs_undel_attach()

Attaches a trashcan directory


Starting Up and Recovering in AdvFS

Overview

This section discusses the logic behind some common AdvFS-specific activities. This material should serve as a guide to important routines supporting AdvFS.

• Startup and recovery overview

• Mounting the file system

• Activating the bitfile-set

• Activating the domain

• Recovering a domain

Startup and Recovery Overview

Starting up AdvFS involves several steps:

1. Begin with a mount(2) system call or vfs_mountroot() which does part of the job.

2. Invoke msfs_mount() found in msfs_vfsops.c.

3. Call get_domain_disks().

This searches the /etc/fdmns/domain for a list of virtual disks.

4. Call advfs_mountfs() (found in msfs_vfsops.c) to do the real work.

Mounting the File System

To mount the file system:

1. Obtain names of the fileset.

2. Activate the bitfile-set with bs_bfset_activate().

3. Initialize various in-memory structures.

4. Open significant bitfiles (tagdir, root, fragment).

5. Link the file system into mount list.

Source location: msfs/osf/msfs_vfsops.c


Activating the Bitfile-Set

Use bs_bfset_activate_int() to activate or find a domain structure.

• bs_bfdmn_tbl_activate() finds the appropriate bitfile-set.

• bs_bfs_find_set() looks in the root tag directory.

Source location: msfs/bs/bs_bitfile_sets.c

Activating the Domain and Searching for Virtual Disks

Use bs_bfdmn_tbl_activate() to search for virtual disks. If the domain is not active:

1. Search virtual disks of domain.

2. Check for consistencies:

— Virtual disk count on disk

— Number of links in /etc/fdmns

3. Find the transaction log.

4. Activate the domain using bs_bfdmn_activate().

Source location: msfs/bs/bs_domain.c

Activating the Domain: Full Activation

Use bs_bfdmn_activate() to activate the domain.

1. Open the transaction log using lgr_open().

2. Open root tag directory when appropriate.

3. Start crash recovery activities with ftx_bfdmn_recovery().

4. Remove delete-pending filesets.

Source location: msfs/bs/bs_domain.c

Recovering a Domain

Use ftx_bfdmn_recovery() to recover a domain in three recovery passes:

• Pass 1 -- RBMT file

• Pass 2 -- Other reserved metadata bitfiles

• Pass 3 -- Other metadata bitfiles

After the three passes, perform any further recovery actions.

Source location: msfs/bs/ftx_recovery.c.


Recovery Pass: Recovers Domain Consistency

Use ftx_recovery_pass() to recover domain consistency.

To scan the log:

1. Read a record.

2. Put in slot for this FTX ID (and allocate new one if needed).

a. On pass 1, buffer continuation and root done record.

b. If record matches current pass, perform:

* Record image redo records.

* Operation redo record.

c. If level and member are zero, free the FTX slot.

3. Perform routine ftx_recovery_pass() of msfs/bs/ftx_recovery.c.

4. Loop through remaining FTX slots.

5. If level is not zero, this is part of an uncompleted transaction.

a. Fail the transaction.

b. Execute the undo records in pass appropriate manner.

6. If level is zero, perform the root done operations.
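Steps 4 through 6 amount to classifying whatever FTX slots are still occupied once the log scan ends. The following is a minimal sketch of that classification, assuming a slot records only whether it is in use and its nesting level. The names demo_ftx_slot, demo_tally, and demo_finish_pass() are hypothetical, and where the real routine executes undo records or root done operations this sketch merely counts.

```c
#include <stddef.h>

typedef struct {
    int in_use;   /* slot still holds a transaction after the scan */
    int level;    /* nonzero: the transaction never completed */
} demo_ftx_slot;

typedef struct {
    size_t failed;      /* transactions that would be failed and undone */
    size_t root_done;   /* transactions needing root done operations */
} demo_tally;

/* Walk the remaining slots and sort them into the two cases. */
demo_tally demo_finish_pass(const demo_ftx_slot *slots, size_t n)
{
    demo_tally t = { 0, 0 };

    for (size_t i = 0; i < n; i++) {
        if (!slots[i].in_use)
            continue;
        if (slots[i].level != 0)
            t.failed++;
        else
            t.root_done++;
    }
    return t;
}
```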


Providing Storage Management

Overview

This section discusses routines that provide storage allocation.

• Bitfile access subsystem (BAS) level storage allocation

• File access subsystem (FAS) level storage allocation

• Truncating bitfiles, fragment creation

BAS-Level Storage Allocation

Some storage bitmap (SBM) information is cached in memory data structures:

• Disk free storage list:

— Starting address and size of free storage

— May not be large enough to hold all free storage locations, especially if the disk is very fragmented

• BAS-level routines add storage:

— Without much regard to efficiency

— Although they will join adjacent grants into one extent (thus small sequential extents may become one)

Source locations:

• msfs/bs/bs_stg.c

• msfs/bs/bs_sbm.c
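Joining adjacent grants means that a new piece of storage which begins exactly where the previous extent ends extends that extent instead of starting a new one. The following is a minimal sketch of the idea, assuming an extent is just a start block and a block count. The demo_extent type and demo_add_grant() are hypothetical and are not taken from bs_stg.c.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t start;   /* first block of the extent */
    uint32_t count;   /* number of blocks */
} demo_extent;

/* Append a grant to an extent array of capacity cap holding n entries.
 * A grant that is physically adjacent to the last extent is merged
 * into it. Returns the new number of entries; if the array is full
 * and no merge is possible, the grant is dropped and n is returned. */
size_t demo_add_grant(demo_extent *xt, size_t n, size_t cap, demo_extent grant)
{
    if (n > 0 && xt[n - 1].start + xt[n - 1].count == grant.start) {
        xt[n - 1].count += grant.count;
        return n;
    }
    if (n < cap)
        xt[n++] = grant;
    return n;
}
```

With this rule, two sequential 16-block grants at blocks 100 and 116 end up as a single 32-block extent.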

FAS-Level Storage Allocation

One concern of FAS-level storage is page-write clustering. If the file is being written sequentially, data space is preallocated in page sizes of:

• MIN (pg_to_write/4, MAX_PREALLOC_PAGES)

• pg_to_write is present page number

• MAX_PREALLOC_PAGES is presently 16

If this fails, data space is allocated as needed. BAS level will combine adjacent allocations.

Source location of fs_read_write_stg() is msfs/fs/fs_read_write.c.
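The preallocation formula is easy to work through by hand. The following is a minimal sketch of it, assuming integer division and the limit of 16 pages quoted above. The function name prealloc_pages() is hypothetical and the code is not taken from fs_read_write.c.

```c
#define MAX_PREALLOC_PAGES 16UL   /* value quoted in the text */

/* MIN(pg_to_write / 4, MAX_PREALLOC_PAGES), using integer division. */
unsigned long prealloc_pages(unsigned long pg_to_write)
{
    unsigned long want = pg_to_write / 4;

    return want < MAX_PREALLOC_PAGES ? want : MAX_PREALLOC_PAGES;
}
```

Under these assumptions, a write at page 8 preallocates 2 pages, and any write at page 64 or beyond preallocates the maximum of 16.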


Truncating Bitfiles

AdvFS preallocates disk space to prevent multiple trips to the SBM information and to promote large extents. If the file write does not demand all preallocated disk pages, file truncation and possibly fragmentation will take place upon file close:

• When bitfile closes, AdvFS determines if last page should be allocated in the fragment file.

• If necessary:

— A fragment is allocated.

— Last page is now unused.

• If there are unused pages at end of file:

— Unused pages are deallocated.

— This can result in the release of small disk areas.

Source locations:

• fs_create_frag() in msfs/fs/fs_file_sets.c for file fragmentation.

• bf_setup_truncation() in msfs/fs/fs_create.c for file truncation.


Cloning

Overview

Cloning a fileset creates a read-only pseudo copy (snapshot) of a file system, usually for the purpose of online backups. This section discusses some routines used in cloning.

• Creating a clone

• Writing to a cloned original

• Reading from a clone

• Deleting bitfile from cloned original

Creating a Clone

This section introduces several routines involved in clone creation. The fs_fset_clone() routine performs various access checks.

The following example shows the fs_fset_clone() argument list.

Example 4-10: Prototype for fs_fset_clone()

/*
 * fs_fset_clone
 *
 * Creates a clone file set of an ’original’ file set.
 */

statusT
fs_fset_clone(
    char *domain,               /* in - name of set’s domain */
    char *origSetName,          /* in - name of orig set */
    char *cloneSetName,         /* in - name of new clone set */
    bfSetIdT *retCloneBfSetId,  /* out - clone set’s id */
    long xid                    /* in - CFS transaction id */
    )

The bs_bfs_clone() routine:

• Creates new bitfile-set.

• Copies original’s tagfile to clone’s tagfile.

• Makes appropriate modifications to bitfile-set attributes record.

Files open when cloning may not have perfect snapshots.

Source locations:

• fs_fset_clone() in msfs/fs/fs_file_sets.c.

• bs_bfs_clone() in msfs/bs/bs_bitfile_sets.c.


Writing to a Cloned Original

This section lists the steps needed to handle an altered original:

• Bitfile pages of original are copy-on-write.

• On first modification of bitfile:

— New mcell is allocated for clone bitfile.

— Original and clone primary mcells are now different.

• On first modification of bitfile page:

— New extent is allocated for clone bitfile.

— Original data is copied to clone’s extent.

— Clone extent map has holes for original data.

• Source location in msfs/bs/bs_bitfile_sets.c:

— bs_cow_pg()

— bs_cow()

— clone()
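The page-level copy-on-write step can be sketched as follows: before the original page is overwritten for the first time, its current contents are copied into storage belonging to the clone. This is a minimal in-memory sketch, assuming fixed-size pages held in arrays. The demo_file type, its sizes, and demo_write_page() are hypothetical and stand in for the extent allocation that the routines listed above perform on disk.

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_PAGES 4
#define DEMO_PAGE_SIZE 8

typedef struct {
    char data[DEMO_PAGES][DEMO_PAGE_SIZE];
    bool mapped[DEMO_PAGES];   /* true once the clone owns a copy of the page */
} demo_file;

/* Write src into a page of the original, preserving the old contents
 * in the clone the first time that page is modified. */
void demo_write_page(demo_file *orig, demo_file *clone, int page, const char *src)
{
    if (!clone->mapped[page]) {
        memcpy(clone->data[page], orig->data[page], DEMO_PAGE_SIZE);
        clone->mapped[page] = true;
    }
    memcpy(orig->data[page], src, DEMO_PAGE_SIZE);
}
```

A second write to the same page skips the copy, so the clone keeps the contents from the moment it was created.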

Reading from a Clone

Use the following sequence to read from a clone.

1. Determine if clone bitfile has requested page.

2. If not:

a. Determine if page really is within range of clone bitfile.

b. Check extent maps of original bitfile for page.

3. If a page is written into a hole of the original, the clone must be given a “permanent hole” extent.

Note that AdvFS optimizes I/O to the original bitfile-set, not to the clone.
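The read sequence can be expressed as a small lookup routine: try the clone first, then fall back to the original. The following is a minimal sketch, assuming each bitfile is described by a per-page "mapped" flag instead of a real extent map. All demo_ names are hypothetical. The range check is done first here only so that the array lookups stay in bounds; the steps above list it second.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const bool *mapped;   /* mapped[i] is true when page i has storage here */
    size_t page_count;    /* pages covered by this bitfile */
} demo_bitfile;

typedef enum {
    DEMO_FROM_CLONE,
    DEMO_FROM_ORIGINAL,
    DEMO_HOLE,
    DEMO_OUT_OF_RANGE
} demo_source;

/* Decide where the data for a page of the clone should be read from. */
demo_source demo_clone_read(const demo_bitfile *clone,
                            const demo_bitfile *orig, size_t page)
{
    /* Range check first so the array lookups below stay in bounds. */
    if (page >= clone->page_count)
        return DEMO_OUT_OF_RANGE;
    if (clone->mapped[page])
        return DEMO_FROM_CLONE;
    if (page < orig->page_count && orig->mapped[page])
        return DEMO_FROM_ORIGINAL;
    return DEMO_HOLE;
}
```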


Deleting Bitfile from Cloned Original

The following sequence is needed to delete a file from the original fileset while the clone exists:

1. Ensure data is available for clone after deletion from original fileset.

2. Original fileset is marked delete with clone.

It exists until clone fileset is deleted.

This is not the same as unlinking a file from fileset.

FAS-level understands multiple links for one file.

Source location: msfs/bs/bs_delete.c

Deleting a Bitfile

The sequence for deleting a file is:

1. Set bitfile attributes state to BSRA_DELETING.

2. Delete the bitfile from the tagfile.

3. Add bitfile to deferred-delete list (DDL) for disk.

If system crashes, on recovery DDL is processed.

4. Wait for bitfile to close to reap the storage.

There are some variations of this process.

Source location: msfs/bs/bs_delete.c

Closing a Deleted Bitfile

The final steps of the deletion happen when the file is closed:

1. Carefully delete the storage.

2. Perform a series of root transactions.

a. Pin several pages of SBM.

b. Update the storage bitmap to delete extents.

c. Update the delRst field of bitfile’s extent map to point to next extent to delete.

The disk storage delete code is found in del_dealloc_stg() and del_xtnt_array() in msfs/bs/bs_delete.c.

The mcell chain delete code is found in bmt_free_bf_mcells() in msfs/bs/bs_bmt_util.c.

Carefully delete the bitfile’s mcell chain.


3. Perform a series of continued transactions.

a. Pin several pages of BMT.

b. Free the mcells on those pages.

c. Start a continuation transaction which knows next mcell to delete.


Migrating Files and Deleting Filesets

Overview

This section describes the sequence of routines that accomplish file migration and fileset deletion. File migration takes place when either the migrate command or the defragment command is used.

• Migrating a bitfile

• Deleting a fileset

Migrating a Bitfile

Use the following sequence to migrate a file:

1. Allocate new target storage.

2. Place target on deferred delete list.

If system crashes, it is gone on recovery.

3. Put target storage on copy extent map list.

Modifications will go to both source and target.

4. Copy blocks, source to target.

5. Flush blocks.

6. Switch roles on target and source.

Source will be reclaimed.

Source location: msfs/bs/bs_migrate.c

Deleting a Fileset

Use the following sequence to delete a fileset:

1. Add bitfile-set to domain’s delete pending list.

2. Iterate through the tags of the bitfile-set.

3. Delete each bitfile.

4. Remove bitfile-set from bitfile-set delete pending list.

5. Delete tagfile.

Source location: bs_bfs_delete() is in msfs/bs/bs_bitfile_sets.c


Documenting Threads

Overview

This section documents several kernel threads that are active behind the scenes to keep AdvFS in sync. Kernel threads are found under PID 0.

• AdvFS threads

• Fragment bitfile thread

• I/O thread

• AdvFS cleanup thread

AdvFS Threads

The following are some common characteristics of the AdvFS threads:

• Are created by kernel idle thread routine (PID 0)

• Receive typed messages on queue

• Block with cond_wait()

Source location: msfs/bs/bs_msg_queue.c

Fragment Bitfile Thread

Fragment groups are trimmed periodically by the fragment bitfile thread (one per system):

• Deallocates frag groups of type 0 when there are too many

Target is AdvfsMinFragGrps (default is 16)

• Is awakened from frag_group_dalloc() with message containing bitfile-set ID

Kernel thread routine is bs_fragbf_thread() in msfs/bs/bs_bitfile_sets.c.

Default value for free fragment groups is 16. The value can be changed with sysconfigtab.
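The trimming decision can be illustrated with a one-line calculation. This is a minimal sketch, assuming the thread releases only the surplus of free groups above the target; the text does not spell out the exact policy. DEMO_MIN_FRAG_GRPS and demo_groups_to_free() are hypothetical names, and the value 16 is the default quoted above.

```c
#define DEMO_MIN_FRAG_GRPS 16u   /* default quoted in the text */

/* Number of free fragment groups to release so that no more than
 * min_grps remain; zero when the count is already at or below target. */
unsigned demo_groups_to_free(unsigned free_grps, unsigned min_grps)
{
    return free_grps > min_grps ? free_grps - min_grps : 0u;
}
```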


I/O Thread

This thread monitors its message queue for requests to trigger I/O:

• For START_MORE_IO messages:

— Calls bs_startio() for a virtual disk

— Is awakened by bs_osf_complete() when queue is small

• For LF_PB_CONT messages:

— Checks if a log flush continue or a pin block continue is needed

— Is awakened by bs_io_complete() if HiFlushLSN has changed

Source locations:

• msfs/bs/bs_qio.c for bs_io_thread()

• msfs/osf/msfs_io.c for bs_osf_complete()

AdvFS Cleanup Thread

Various system routines communicate with this kernel thread using its message queue:

• For FINSH_DIR_TRUNC messages:

— Truncates space from directory

— Is awakened by routines to insert directory entries

• For CLEANUP_CLOSED_LIST messages:

— Moves bfAccess structures from closed to free list

— Awakened by routines which allocate bfAccess structures

Source location: msfs/fs/fs_dir_init.c


Summary

Describing Entries to AdvFS

The VFS switch table defines a series of pointers to functions implementing many of the file system level activities.

The vnode switch table defines a series of pointers to functions implementing most of the file-oriented AdvFS activities.

Data read from an AdvFS file system ultimately ends up in a memory location that is associated with the unified buffer cache.

AdvFS I/O must eventually cause device activity to begin. The device activity is triggered by device driver routines.

Starting Up and Recovering in AdvFS

Use bs_bfset_activate_int() to activate or find a domain structure.

Use bs_bfdmn_tbl_activate() to search for virtual disks.

Use bs_bfdmn_activate() to activate the domain.

Use ftx_bfdmn_recovery() to recover a domain in three recovery passes.

Providing Storage Management

Some SBM information is cached in memory data structures:

• Disk free storage list

• BAS-level routines add storage

One concern of FAS-level storage is page-write clustering. If the file is being written sequentially, data is preallocated in page sizes of:

• MIN (pg_to_write/4, MAX_PREALLOC_PAGES)

• pg_to_write is present page number

• MAX_PREALLOC_PAGES is presently 16

If this fails, data is allocated as needed. BAS-level will combine adjacent allocations.


Cloning

Cloning a fileset creates a read-only pseudo copy (snapshot) of a file system, usually for the purpose of online backups.

The fs_fset_clone() routine performs various access checks. The bs_bfs_clone() routine:

• Creates new bitfile-set.

• Copies original’s tagfile to clone’s tagfile.

• Makes appropriate modifications to bitfile-set attributes record.

Files open when cloning may not have perfect snapshots.

Migrating Files and Deleting Filesets

File migration takes place when either the migrate command or the defragment command is used. To migrate a file:

1. Allocate new target storage.

2. Place target on deferred delete list.

3. Put target storage on copy extent map list.

4. Copy blocks, source to target.

5. Flush blocks.

6. Switch roles on target and source.

To delete a fileset:

1. Add bitfile-set to domain’s delete pending list.

2. Iterate through the tags of the bitfile-set.

3. Delete each bitfile.

4. Remove bitfile-set from bitfile-set delete pending list.

5. Delete tagfile.

Documenting Threads

AdvFS threads are created by the kernel idle thread routine (PID 0).

• Fragment bitfile thread: One per system; deallocates frag groups of type 0

• I/O thread: For START_MORE_IO and LF_PB_CONT messages

• AdvFS cleanup thread: For FINSH_DIR_TRUNC and CLEANUP_CLOSED_LIST messages


Exercises

Labs for this section involve reading any of the source code specified in the student materials (if it is available). The instructor will suggest several routines.


Solutions

If you are not confident using the C programming language, this code reading can be done as a group. Routine fs_fset_clone() found in fs_file_sets.c may be a good starting point.


5

Troubleshooting AdvFS


About This Chapter

Introduction

This chapter presents some AdvFS troubleshooting tips and hints. It also discusses some known AdvFS issues and problems, and presents a number of actual troubleshooting case studies.

Objectives

To troubleshoot AdvFS you should be able to:

• Identify commands, tools, and practices to isolate the problem.

• Examine a sample problem and identify possible solutions.

Resources

For more information on topics in this chapter as well as related topics, see the following:

• Advanced File System Administration

• Tru64 UNIX System Configuration and Tuning

• Tru64 UNIX AdvFS Reference pages

Case Study Format

The case studies found in this chapter are presented using the following format:

• Problem statement

• Configuration

• Problem description

• Analysis

• Things attempted

• Final solution/summary


Describing AdvFS Troubleshooting Practices

AdvFS Commands and Utilities

This table provides a summary of AdvFS troubleshooting-related commands. See the AdvFS commands in the Appendix book for more detailed information on these commands.

Table 5-1: AdvFS Commands and Utilities

Command              Function
addvol               Adds a volume to an existing file domain.
advfsstat            Displays performance statistics.
advscan              Locates AdvFS volumes (disk partitions or LSM disk groups) that are in AdvFS domains.
balance              Balances the percentage of used space among volumes in a domain.
chfile               Changes attributes of an AdvFS file.
chfsets              Changes fileset quotas (file usage limits and block usage limits).
chvol                Changes attributes of a volume in an active domain.
defragment           Makes the files on a disk more contiguous.
migrate              Moves a file or file pages to another volume in an AdvFS domain.
mkfdmn               Creates a new AdvFS file domain.
mkfset               Creates an AdvFS fileset.
mountlist            Checks for mounted AdvFS filesets.
ncheck               Lists i-number or tag and path for all files in a file system.
nvbmtpg              Displays pages of AdvFS bitfile metadata table (BMT) file; new command in Tru64 UNIX V5.0.
nvfragpg             Displays the pages of an AdvFS fragment file; new command in Tru64 UNIX V5.0.
nvlogpg              Displays the log file of an AdvFS file domain; new command in Tru64 UNIX V5.0.
nvsbmpg              Displays a page of the storage bitmap (SBM) file; new command in Tru64 UNIX V5.0.
nvtagpg              Displays a page formatted as a tag file page; new command in Tru64 UNIX V5.0.
rmfdmn               Removes a file domain.
rmfset               Removes a fileset or a clone fileset from an AdvFS file domain.
rmvol                Removes a volume from an existing file domain.
salvage              Recovers file data from damaged AdvFS file domains; new command in Tru64 UNIX Version 5.0 release. Versions are available for previous releases of Tru64 UNIX.
shblk                Displays unformatted disk blocks.
shfragbf             Displays how much space is used on the fragment file.
showfdmn             Displays the attributes of a file domain and detailed information about each volume in the file domain.
showfile             Displays the attributes of AdvFS directories and files.
showfsets            Displays information about filesets in an AdvFS domain.
stripe               Stripes a file across several volumes in a file domain.
switchlog            Moves an AdvFS file domain transaction log.
tag2name             Displays the path name of a file given the tag number.
vbmtchain            Displays metadata for a file including the time-stamp, extent map, and whether the file is a user directory or data file.
vbmtpg               Displays a complete, formatted page of the BMT for a mounted or unmounted domain.
vdf                  Displays disk information for AdvFS domains and filesets; new command in the Tru64 UNIX Version 5.0 release.
vdump, rvdump        Performs full and incremental backups on filesets.
verify               Checks on-disk structures such as the BMT, storage bitmaps, tag directory, and the fragment file for each fileset; included in Tru64 UNIX Version 4.0 and higher.
vfile                Displays the contents of a file from an unmounted domain.
vfilepg              Displays pages of an AdvFS file.
vfragpg              Displays a single header page of a fragment file.
vlogpg               Translates a 16-block part of a volume of an unmounted file system and formats it as a log page.
vlsnpg               Displays the logical sequence number (LSN) of a page of the log.
vrestore, rvrestore  Restores files from savesets produced by vdump and rvdump.
vsbmpg               Displays a page from a storage bitmap (SBM) file.
vtagpg               Displays a formatted page of a file.

Troubleshooting Tips and Practices

Here are some general troubleshooting practices and procedures you should consider when investigating an AdvFS problem.

Describe the problem (and any relevant circumstances) in as much detail as possible. Include in this description the answers to these types of questions:

• How often has the problem occurred?

• Is the problem reproducible?

• When was the last time this feature worked properly?

• What steps led to the problem?

• Is there anything else that is not behaving normally (whether related or not)?

• What (if any) parts of the system that might be relevant are working as expected?

• Was anything happening physically close to the system at the time the problem appeared? For example, was a cable unintentionally disconnected?

• Has anything changed either in the hardware or software configuration? For example, changes in hardware, software/firmware updates, patches, or installations?

• If any changes have taken place, can the configuration be restored to its original state to determine whether the problem is still present? (Note: changes should be made sparingly and in as logical a sequence as possible to reduce the number of possible factors influencing or potentially masking the problem.)

Check for hardware-related causes of the problem.

• Any obvious problems with cables, connectors or terminators. For example, are the cables too long? Are there any bent pins? Is the hardware seated properly?

• Check hardware and firmware revision levels (sys_check mentioned below will look for these).

• Does the problem move with the hardware?

Check these locations for any error messages:

• binary.errlog

• /var/adm/syslog.dated/<date>/kern.log

• /var/adm/syslog.dated/<date>/daemon.log

• /var/adm/messages

The /var/adm/syslog.dated/<date>/kern.log and /var/adm/messages files are written by the syslogd daemon. Many errors encountered on the system that are not hardware related are written to the syslog facility and, depending upon the configuration of the /etc/syslog.conf file, go to one or all of these logs. In addition to kern.log, daemon.log is another log used heavily by specialists troubleshooting ASE problems. kern.log and messages normally contain more messages related to AdvFS-specific issues (for example, AdvFS I/O errors).
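As a first pass over these logs, a short script can pull out the lines that mention AdvFS. The following is only a sketch: the sample lines are invented for the example and do not reproduce the wording of real kernel messages, and matching on the substring "advfs" is an assumption about how relevant entries can be recognized.

```python
def advfs_lines(lines):
    """Return the log lines that mention AdvFS (case-insensitive match)."""
    return [line.rstrip("\n") for line in lines if "advfs" in line.lower()]


# Hypothetical sample; real kern.log entries will be worded differently.
sample = [
    "Oct 13 10:35:04 den255 vmunix: AdvFS I/O error (hypothetical text)\n",
    "Oct 13 10:35:05 den255 vmunix: unrelated kernel message\n",
    "Oct 13 10:35:06 den255 vmunix: AdvFS domain panic (hypothetical text)\n",
]
matches = advfs_lines(sample)
for line in matches:
    print(line)
```

To scan a real log, pass an open file object for kern.log or messages to advfs_lines instead of the sample list.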

• Check the advfs_err(4) reference page to find a brief description based on an error number.

• Search CANASTA if a panic is involved.

CANASTA is a Compaq internal crash dump analysis tool being used world-wide inside Compaq to store and evaluate crash footprint information for OpenVMS Alpha, OpenVMS VAX and Tru64 UNIX system crashes.

CANASTA uses AI technology to provide solutions or additional troubleshooting information for system crash problems. The CANASTA tool is typically used in the Customer Service Center (CSC), but access to the CANASTA knowledge database is also available using the CANASTA Mail Server, TIMA STARS and COMET.

By using the AutoCLUE tool, customer crash dump information can be automatically sent to Compaq using DSNlink and will be analyzed using the DSNlink CLUE post-processor. Solution information, if available, can be automatically returned to the customer and/or included in the call handling system.

• If you think it might be a bug in the software, research the reported bugs and patches for potential similarities.

Use the following sources to look for any existing information:

— The Atlanta CSC UNIX Web page has links to many useful sources.

— COMET search for past cases: Integrated Problem Management Tool (IPMT)/QAR entries, blitzes and notes conferences.

— COMET is an intelligent web-based storage and retrieval tool which allows users to search large collections of documents. It differs from STARS in its search algorithms and available information. A prime feature, Smart Query, decides which subset of COMET's 500 databases is most likely to contain the information you want. An account is not required to access COMET.

— Search of blitzes database:

— Notes conferences:

— Patch READMEs

— AdvFS/LSM Focals

— AdvFS/LSM Manuals

— SPD

• Use system tools to check for problems.

— sys_check

sys_check is a useful ksh script that can help to debug or diagnose system problems. The script generates an HTML file of a Tru64 UNIX configuration. This script has been tested on DIGITAL UNIX Version 3.2*, and Version 4.0 systems.

— iostat

— Performance Manager

• Use AdvFS tools and utilities to check for and fix problems.

Troubleshooting File System Corruption

Overview

This section presents some general information about troubleshooting AdvFS file system corruption problems.

Generally, fixing an AdvFS corruption problem depends on what caused the corruption. It is important to analyze an AdvFS corruption problem to determine the root cause.

Recognizing File System Corruption

Customers rarely have a problem recognizing and reporting file system corruption. Some symptoms of file system corruption a customer might report include:

• System panic

• Domain panic

• Corrupted data

• Unexpected behavior after entering ordinary commands on files in an AdvFS file system

Causes of AdvFS Corruption

AdvFS corruption is usually caused by one of the following:

• Hardware problem

Hardware problems are the most common sources of AdvFS-related system panics. The most frequent cause of corruption in any file system is bad blocks on the physical disk. Another common cause is outdated firmware revisions.

• Uncontrolled system shutdown

AdvFS is generally robust enough to withstand unexpected system crashes or power outages, but these events may still cause corruption in certain cases.

• Software bugs in the AdvFS software

Software bugs can often be reproduced. AdvFS software bugs are usually fixed by patches. Any available, relevant patches should be applied in the initial stages of troubleshooting a problem. Available resources should be checked for relevant patches since it is not always obvious which patches might be relevant to AdvFS.

No Valid File System Error Message

Possible causes of this symptom include:

• Corrupted metadata, BMT, or transaction log possibly caused by a disk, controller or other hardware problem

• Software bug

Possible troubleshooting actions include:

• If there was a panic, search CANASTA.

• Check the binary errorlog for bad block replacements or other I/O errors. If excessive, ensure the hardware problem is resolved before taking any other action.

• Run sys_check.

• Repair and/or restore from backup.

• Repair domain structures using the verify utility for DIGITAL UNIX Version 4.x systems. (Use the msfsck and vchkdir commands for DIGITAL UNIX Version 3.x systems.)

— If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.

— Only if both methods are unsatisfactory should you employ the salvage utility. salvage is a new utility available in Tru64 UNIX Version 5.0. (A field test version of salvage is available in DIGITAL UNIX Version 4.0D.)

• If applicable, ensure that the Logical Storage Manager (LSM) is started by checking for the vold daemon. If necessary, start LSM.

• Check the links in /etc/fdmns/domainname directory for correctness.

• File an IPMT.
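The check of the links in the /etc/fdmns domain directory can be partly automated. The sketch below lists each entry in a domain directory and reports whether it is a symbolic link whose target exists. It is an illustration only: the function name is hypothetical, the demonstration runs against a throwaway directory rather than /etc/fdmns, and an existing target does not prove that the link points at the correct device for the domain.

```python
import os
import tempfile


def check_domain_links(domain_dir):
    """Return (name, target, ok) for each entry in an AdvFS domain directory.

    ok is True only when the entry is a symbolic link whose target exists.
    """
    results = []
    for name in sorted(os.listdir(domain_dir)):
        path = os.path.join(domain_dir, name)
        if os.path.islink(path):
            target = os.readlink(path)
            ok = os.path.exists(path)  # exists() follows the link
        else:
            target = None
            ok = False
        results.append((name, target, ok))
    return results


# Demonstration against a temporary directory instead of /etc/fdmns.
with tempfile.TemporaryDirectory() as tmp:
    device = os.path.join(tmp, "fake_device")
    open(device, "w").close()
    domain = os.path.join(tmp, "demo_dom")
    os.mkdir(domain)
    os.symlink(device, os.path.join(domain, "good_vol"))
    os.symlink(os.path.join(tmp, "missing"), os.path.join(domain, "bad_vol"))
    report = check_domain_links(domain)
    for name, target, ok in report:
        print(name, "->", target, "OK" if ok else "BROKEN")
```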

Mount File System Operation Crashes the System

Possible causes of this symptom include:

• Corrupted metadata, BMT, or transaction log. Possibly caused by disk, controller, or other hardware problem

• Software bug

Possible troubleshooting actions include:

• Analyze the system crash dump for insight towards determining the next troubleshooting step. Search CANASTA.

• Run sys_check.

• Check the binary errorlog for bad block replacements or other I/O errors. If excessive, ensure the hardware problem is resolved before taking any other action.

• Attempt to execute the mount -d or mount -r commands to mount the file system.

This technique has been useful because it allows you to get a backup of the file system. The specialist should use caution when using either flag. The -d flag disables transaction logging. The -r flag mounts the file system with read-only access.

• Repair and/or restore from backup media.

— Repair domain structures using the verify utility for DIGITAL UNIX Version 4.x systems. However, since verify will attempt to mount the filesets, a system panic will most likely occur. (Use the msfsck and the vchkdir commands for DIGITAL UNIX Version 3.x systems.)

— If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.

— Only if both methods are unsatisfactory should you employ the salvage utility. salvage is a new utility available in Tru64 UNIX Version 5.0.

— File an IPMT.

Localized Corruption

Localized corruption is often tolerable. The verify utility may be useful in cases of localized corruption.

Symptoms of localized corruption include:

• Normal file manipulations do not work properly for a few files on the file system.

• Customer notices AdvFS I/O errors in the messages or kern.log files.

• No problems mounting file system.

Possible causes include:

• Corrupted directory or files possibly caused by bad blocks

• CPU exceptions or other hardware problems.

• Uncontrolled system shutdown such as power failure or crash.

• Software bug.

Possible troubleshooting actions include:

• Check the binary errorlog for bad block replacements or other hardware events. If excessive, ensure the hardware problem is resolved before taking any other action.

• If the corruption is not increasing and remains localized, add a new volume to replace the volume experiencing the errors.

• Repair and/or restore from backup media.

— Repair domain structures using the verify utility for DIGITAL UNIX Version 4.x systems. However, since verify will attempt to mount the filesets, a system panic will most likely occur. (Use the msfsck and the vchkdir commands for DIGITAL UNIX Version 3.x systems.)

— If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.

— Only if both methods are unsatisfactory should you employ the salvage utility. salvage is a new utility available in Tru64 UNIX Version 5.0.

• File an IPMT.

Generalized Corruption

Generalized corruption is often a more serious situation than localized corruption. The verify utility is not generally useful in fixing generalized corruption. Time spent using verify on generalized corruption problems may be better spent running salvage or restoring from backup.

Symptoms of generalized corruption include:

• Normal file manipulations do not work properly for many or all of the files on a file system.

• Numerous AdvFS I/O errors in the messages or kern.log files.

• No problem mounting the filesets in the domain.

Possible causes include:

• Corrupted metadata or fragment list possibly caused by bad blocks, CPU exceptions or other hardware problems.

• Uncontrolled system shutdown such as power failure or crash.

• Software bug.

Possible troubleshooting actions include:

• Check the binary errorlog for bad block replacements or other hardware events. If excessive, ensure the hardware problem is resolved before taking any other action.

• You can try adding volumes and removing the volumes having problems. In the case of general corruption, this will probably not solve the problem. This process is time consuming with a large number of bad files.

Repair and/or restore from backup media.

• Repair domain structures using the verify utility. However, since verify will attempt to mount the filesets, a system panic will most likely occur. (Use the msfsck and the vchkdir commands for DIGITAL UNIX Version 3.x systems.)

• If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.

• Only if both methods are unsatisfactory should you employ the salvage utility. salvage is a new utility available in Tru64 UNIX Version 5.0.

• File an IPMT.

Domain Panic

A domain panic has occurred when the domain goes offline and data is inaccessible to users of the system.

Possible causes include:

• Corrupted metadata, BMT, fragment list or transaction log possibly caused by bad blocks, CPU exceptions or other hardware problems.

• The filesets in the domain will not mount.

• Software bug.

Possible troubleshooting actions:

• Check the binary errorlog for bad block replacements or other I/O errors. If excessive, ensure the hardware problem is resolved before taking any other action.

• Use the mount -d command to try to get data off when restoring any volumes that have I/O errors.

Repair and/or restore from backup media.

• Repair domain structures using the verify utility for DIGITAL UNIX Version 4.x systems. However, since verify will attempt to mount the filesets, a system panic will most likely occur and verify will be unsuccessful in fixing the problem. (Use the msfsck and the vchkdir commands for DIGITAL UNIX Version 3.x systems.)

• If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.

• Only if both methods are unsatisfactory should you employ the salvage utility. salvage is a new utility available in Tru64 UNIX Version 5.0.

• File an IPMT.
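The repair guidance repeated throughout this section follows one escalation order. As a compact summary, the sketch below encodes that order as data. It is an illustrative restatement of the text, not a tool: the function name is hypothetical, the step wording is paraphrased, and the version test assumes that verify applies to Version 4.0 and higher while msfsck and vchkdir apply to Version 3.x.

```python
def repair_plan(os_major_version):
    """Return the ordered repair steps this section recommends.

    os_major_version: 3 for DIGITAL UNIX 3.x, 4 or higher for later releases.
    """
    if os_major_version <= 3:
        repair = "repair domain structures with msfsck and vchkdir"
    else:
        repair = "repair domain structures with verify"
    return [
        "check the binary error log and resolve hardware problems first",
        repair,
        "restore the fileset data from backup media",
        "run salvage only if the two previous steps are unsatisfactory",
        "file an IPMT",
    ]


plan_v3 = repair_plan(3)
plan_v4 = repair_plan(4)
for number, step in enumerate(plan_v4, start=1):
    print(number, step)
```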

Resolving Known AdvFS Issues

Overview

This section highlights a few known AdvFS issues or problems and strategies to resolve or work around these issues.

Log Half-Full Problem

Under some circumstances, in pre-Version 5.0 releases of Tru64 UNIX, AdvFS can panic with the log half full error message.

The following situations are known to have a causal relationship with an AdvFS panic with the log half full error message:

• When a very large file truncate is performed (this can occur when a file is overwritten by another file or by an explicit truncate system call), and the fileset containing the file has a clone fileset.

• When very large, highly fragmented files are migrated. Files with greater than 40000 extents are at risk. A migrate operation is performed when running the defragment, balance, rmvol, or migrate AdvFS utilities.

Regardless of the cause, the problem can be addressed by either reducing file fragmentation or by increasing the size of the log.

Fixing Log Half-Full Problems: Reducing Fragmentation

File fragmentation can be reduced by following these steps:

1. Perform a backup of the files, delete them, and restore them from the backup.

2. Run the defragment utility on the files.

Determining Appropriate Log Size

Use these guidelines to determine an appropriate log size.

Enter this showfile command to determine the number of extents:

showfile -x filename | grep extentCnt

Table 5-2: Log Size Guidelines

Number of Extents    Recommended Log Size (pages)
40000                768
60000                1024
80000                1280
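The lookup in Table 5-2 can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the function names are hypothetical, the "extentCnt:" field format is assumed from the grep pattern shown above, the sample output fragment is invented, and the rules for values between or outside the table rows (round up to the next row, fall back to the 512-page default below 40000 extents, give no answer above 80000) are assumptions rather than documented guidance.

```python
import re

# (extent count, recommended log size in pages), copied from Table 5-2.
LOG_SIZE_GUIDELINES = [(40000, 768), (60000, 1024), (80000, 1280)]
DEFAULT_LOG_PAGES = 512  # default log size in pages


def extent_count(showfile_output):
    """Return the largest extentCnt value found in `showfile -x` output."""
    counts = [int(n) for n in re.findall(r"extentCnt:?\s*(\d+)", showfile_output)]
    if not counts:
        raise ValueError("no extentCnt field found")
    return max(counts)


def recommended_log_pages(extents):
    """Pick the smallest table row that covers the extent count."""
    if extents < LOG_SIZE_GUIDELINES[0][0]:
        return DEFAULT_LOG_PAGES
    for limit, pages in LOG_SIZE_GUIDELINES:
        if extents <= limit:
            return pages
    return None  # beyond the table: no guideline is given


sample_output = "extentMap: 1\n    extentCnt: 52000\n"  # hypothetical fragment
pages = recommended_log_pages(extent_count(sample_output))
print(pages)
```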

Fixing Log Half-Full Problems: Increasing Log Size Using switchlog

If you have a spare partition, you can:

1. Add a spare volume to the domain.

2. Move the log to that volume.

3. Move the log back with an increase in size.

4. Remove the spare volume.

If you have a spare partition, follow these steps to increase the log size:

1. Enter the addvol command specifying the block device special file name of the disk that you are adding to the file domain and the domain name.

# addvol /dev/rz10b domain <== V4.x
# addvol /dev/disk/dsk10b domain <== V5.x

2. Enter the showfdmn command specifying the domain name.

# showfdmn domain

The showfdmn command will display information similar to the following:

               Id              Date Created  LogPgs  Domain Name
31b8a083.00049136  Fri Jun  7 17:34:59 1996     512  small

  Vol   512-Blks        Free  % Used  Cmode  Rblks  Wblks  Vol Name
   1      401408           0    100%     on    128    128  /dev/rz11b
   2L     262144         192    100%     on    128    128  /dev/rz3b
   3      393216           0      0%     on    128    128  /dev/rz10b
      ----------  ----------  ------
         1056768         192    100%

3. Enter the switchlog command specifying the name of the domain and the number of the new volume to use for the log.

# switchlog domain 3

4. Enter the switchlog command specifying a larger log size with the -l option and the number of the volume to use for the log. (The -l option is undocumented.) This command essentially moves the log back with a larger log size.

# switchlog -l 1024 domain 2

5. Enter the following rmvol command to remove the spare volume.

# rmvol /dev/rz10b domain
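The five steps above always produce the same command pattern, so the sequence can be generated from a few parameters and reviewed before anything is typed at the console. The sketch below only builds the command strings and executes nothing. The function and parameter names are hypothetical, and the volume numbers must be read from the showfdmn output for the actual domain.

```python
def log_resize_commands(domain, spare_device, spare_vol, log_vol, new_pages):
    """Build the command sequence for enlarging the log via a spare volume.

    spare_vol and log_vol are the volume numbers reported by showfdmn.
    The commands are returned as strings; nothing is executed.
    """
    return [
        f"addvol {spare_device} {domain}",
        f"showfdmn {domain}",
        f"switchlog {domain} {spare_vol}",
        f"switchlog -l {new_pages} {domain} {log_vol}",
        f"rmvol {spare_device} {domain}",
    ]


commands = log_resize_commands("domain", "/dev/rz10b", 3, 2, 1024)
for command in commands:
    print("#", command)
```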

Fixing Log Half-Full Problems: Increasing Log Size Using mkfdmn

Alternatively, you can set the number of log pages using the mkfdmn command as follows:

mkfdmn -l pages

The default number of pages for the log is 512.

The -l option is an enhancement to the AdvFS mkfdmn command included in DIGITAL UNIX Version 4.0D. If you use the mkfdmn command, the domain will be reinitialized and must be restored from backups. For an existing domain, the log half full problem should therefore be solved by using the switchlog command.

BMT Exhaustion

In DIGITAL UNIX Version 4.0C and earlier only, AdvFS file systems that:

• Consist primarily of small files (less than 8KB) and

• Regularly create and delete very large numbers (many hundreds) of these small files

can run out of metadata space (inode tables), causing misleading out of disk space errors. Internet news servers and mail servers are particularly prone to this problem.

In DIGITAL UNIX Version 4.0D, use space reservation to work around the BMT exhaustion problem. This problem is fixed in Tru64 UNIX Version 5.0.

Avoiding BMT Exhaustion

BMT exhaustion is primarily a problem in DIGITAL UNIX Version 4.0C and earlier. The problem can still occur in DIGITAL UNIX Version 4.0D, but it is less likely because of space reservation. Preallocating the metadata immediately after the file domain is created, and whenever additional volumes are added, avoids BMT exhaustion.

For DIGITAL UNIX Version 3.x, this preallocation can be accomplished by writing a script that creates and then deletes the estimated number of files that are expected to exist in the AdvFS domain. The files created by the script are empty.

For example, the following ksh script preallocates metadata for 1000 files:

# Create 1000 empty files, then remove them all.
integer f=1000
while ((f > 0))
do
    touch prealloc_$f
    f=f-1
done
rm prealloc_*
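The same create-then-delete idea can be expressed in Python, which makes it easy to parameterize the file count and the target directory. This is a sketch only: the function name is hypothetical, and the demonstration uses a temporary directory, whereas on a real system the target would be a directory inside the mounted AdvFS fileset.

```python
import os
import tempfile


def preallocate_metadata(directory, count):
    """Create `count` empty files in `directory`, then delete them all."""
    created = []
    for number in range(count, 0, -1):
        path = os.path.join(directory, f"prealloc_{number}")
        open(path, "w").close()  # same effect as touch on a new file
        created.append(path)
    for path in created:
        os.remove(path)
    return len(created)


# Demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    made = preallocate_metadata(tmp, 1000)
    leftover = os.listdir(tmp)
print(made, leftover)
```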

For DIGITAL UNIX Version 4.x, this preallocation can be accomplished by using the -x and -p switches to the mkfdmn and addvol commands. These switches set the number of file system pages (8 KB each) by which the bitfile metadata table (BMT) is extended (-x) or preallocated (-p).

For example, the following steps show how the -x switch can create a new AdvFS file system containing two volumes in which both will extend their BMT by 2048 pages at a time.

1. Enter the mkfdmn command. Specifying the -o option overwrites an existing file domain, allowing you to recreate the domain structure. Specifying the -x option lets you set the number of pages by which the bitfile metadata table extent size grows. The default is 128 pages.

# mkfdmn -o -x 2048 /dev/vol/vol08 test_dmn

2. Enter the mkfset command to create a new fileset in the specified domain.

# mkfset test_dmn test

3. Enter the addvol command to add a volume to the specified domain. Using the -x option, you can set the number of pages (extent size) by which the bitfile metadata table grows. The default is 128 pages.

# addvol -x 2048 /dev/vol/vol09 test_dmn

For example, the following steps show how to use the -p switch to create a new AdvFS file system containing one volume in which the volume’s BMT is preallocated by 10240 pages.

1. Enter the mkfdmn command. Specifying the -p option lets you set the number of pages by which the bitfile metadata table is preallocated. There is no default.

# mkfdmn -p 10240 /dev/vol/vol06 test_domain

2. Enter the mkfset command to create a new fileset in the specified domain.

# mkfset test_domain test

For example, the following steps show how the -x and -p switches can be used together to create a new AdvFS file system containing one volume in which the volume’s BMT is preallocated by 4096 pages and will extend by 1024 pages.

1. Enter the mkfdmn command as follows:

# mkfdmn -p 4096 -x 1024 /dev/vol/vol06 test_domain

2. Enter the mkfset command to create a new fileset in the specified domain.

# mkfset test_domain test

BMT Extent Map Allocations

Table 5-3 gives a general idea of what the BMT extent map should look like for different values of the -x and -p switches given to mkfdmn or addvol, assuming the domain does not become fragmented before the BMT extent map fills.

If the domain becomes fragmented before filling out the BMT extent map, the size of the extents (at some point between three and 683 extents) will diminish and be smaller than the default (or the value specified using the -x switch). It is not possible to predict those sizes because the file system will attempt to find the largest available hole to hold the extent. The size of this hole depends on the file system fragmentation at the time the system attempts to find the extent.
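The pattern in Table 5-3 is regular enough to model: the first extent is 2 pages, the second is the preallocation size (or the growth size when -p is not given), and every later extent is the growth size. The sketch below reproduces the table columns and totals the pages for the unfragmented case. The function names are hypothetical, and treating 683 as the number of extents in a full map is an assumption based on the last row of the table.

```python
DEFAULT_EXTENT_PAGES = 128   # default growth size for -x
FIRST_EXTENT_PAGES = 2       # extent 1 in every column of Table 5-3
MAX_EXTENTS = 683            # last extent number shown in Table 5-3


def bmt_extent_sizes(x=None, p=None):
    """Return the BMT extent sizes (in pages) for an unfragmented domain."""
    grow = x if x is not None else DEFAULT_EXTENT_PAGES
    second = p if p is not None else grow
    return [FIRST_EXTENT_PAGES, second] + [grow] * (MAX_EXTENTS - 2)


def bmt_capacity_pages(x=None, p=None):
    """Total BMT pages once all extents in the map are in use."""
    return sum(bmt_extent_sizes(x, p))


for label, kwargs in [
    ("default", {}),
    ("-x 1024", {"x": 1024}),
    ("-p 20000", {"p": 20000}),
    ("-x 2048 -p 30000", {"x": 2048, "p": 30000}),
]:
    sizes = bmt_extent_sizes(**kwargs)
    print(label, sizes[:4], bmt_capacity_pages(**kwargs))
```

With the defaults, the model totals 2 + 128 + 681 × 128 = 87298 pages, which shows how much a larger -x value raises the ceiling.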

Table 5-3: BMT Extent Map Allocations

Extent   Default   -x 1024   -p 20000   -x 2048 -p 30000
1        2         2         2          2
2        128       1024      20000      30000
3        128       1024      128        2048
4        128       1024      128        2048
...      ...       ...       ...        ...
683      128       1024      128        2048

BMT Exhaustion: Fixing the Problem

Two common command sequences for fixing a BMT exhaustion problem are:

1. Backup, mkfdmn/mkfset, restore from backup

This method generally restores files and metadata in the domain contiguously. In addition, you can preallocate metadata when making the file domain or adding volumes so that you can be sure that the BMT is contiguous. This strategy requires some down time for that domain while the operation is in progress. For a large domain, this could be a long period of time.

You could use the DIGITAL UNIX Version 3.x method of preallocating files before restoring. In DIGITAL UNIX Version 4.x, you could use the -x and -p flags.

2. addvol, rmvol

The addvol command adds a new volume to the domain with a fresh BMT, to which files will be migrated.

The addvol command also allows you to use the -x and -p options to alter the defaults for BMT. Using the -x flag, you can set the number of pages by which the BMT extent size grows. Using the -p flag, you can set the number of pages to preallocate for the BMT.

The rmvol command automatically migrates files and metadata to other volumes in the domain.

The result is similar to restoring from backups in that the metadata is written in a contiguous fashion, and is therefore defragmented. Defragmented metadata generally will not become exhausted as quickly.

These commands can be executed while the domain is online, minimizing the impact on the users of the domain.

Additional disk space is needed for the new volume. In an existing multivolume domain, you must add a volume equivalent in size to the largest volume in that domain.

Case Study 1: RBMT Corruption

Overview

This case study describes an RBMT corruption problem.

Problem Statement: Case Study 1

The problem statement received from the customer was:

"A multivolume AdvFS domain will not mount."

Configuration: Case Study 1

The customer experienced the problem on a standalone DECstation 255 running Tru64 UNIX V5.0. The disks were both internal to the machine and in a BA353 storage unit.

Problem Description: Case Study 1

The customer reported that he was unable to mount file systems within a particular domain (bruden_dom). Other domains were functioning normally. No filesets within the afflicted domain would mount. One of the file systems had an existing clone.

Domain panic messages and mail were repeating, as were console messages indicating domain panics.

Analysis

Here is the analysis of the problem.

1. Since other domains were functioning normally, the specialist deduced that the problem was localized to the bruden_dom volumes. The error log showed no recent bad block replacements.

# mount bruden_dom#bruce_fset /usr/bruce
bruden_dom#bruce_fset on /usr/bruce: I/O error
# mount -d bruden_dom#bruce_fset /usr/bruce
bruden_dom#bruce_fset on /usr/bruce: I/O error

2. Attempt to gather information about the domain.

# ls -l /etc/fdmns/bruden_dom
total 0
lrwxr-xr-x 1 root system 15 Sep 28 16:59 dsk0a -> /dev/disk/dsk0a
lrwxr-xr-x 1 root system 15 Sep 28 17:01 dsk0b -> /dev/disk/dsk0b
lrwxr-xr-x 1 root system 15 Sep 28 17:08 dsk2h -> /dev/disk/dsk2h
# showfsets bruden_dom
bruce_fset
        Id           : 37f12c39.000263ea.1.8001
        Files        : 6, SLim= 0, HLim= 0
        Blocks (512) : 68288, SLim= 50000, HLim= 200000 grc= none
        Quota Status : user=off group=off

dennis_fset
        Id           : 37f12c39.000263ea.2.8001
        Clone is     : den_clone
        Files        : 324, SLim= 0, HLim= 0
        Blocks (512) : 93114, SLim= 0, HLim= 400000
        Quota Status : user=off group=off

den_clone
        Id           : 37f12c39.000263ea.3.8003
        Clone of     : dennis_fset
        Revision     : 3

3. Check for and evaluate any logged error messages.

4. Try verify to check for and correct any corruption.

# verify -f bruden_dom
verify: can’t get set info for domain ’bruden_dom’
verify: error = E_BAD_BMT (-1171)
verify: can’t allocate memory for fileset mount_point array
Unable to malloc an additional 0 bytes, currently using 0
exiting...

5. verify indicates a problem in the BMT. Let’s see if salvage can give us some help or insight. salvage (ultimately) causes further domain panics.

# salvage bruden_dom
salvage: Domain to be recovered ’bruden_dom’
salvage: Volumes to be used ’/dev/disk/dsk0a’ ’/dev/disk/dsk0b’ ’/dev/disk/dsk2h’
salvage: Files will be restored to ’.’
salvage: Logfile will be placed in ’./salvage.log’
salvage: Starting search of all filesets: 13-Oct-1999 10:38:26
salvage: Starting search of all volumes: 13-Oct-1999 10:38:26
salvage: Loading file names for all filesets: 13-Oct-1999 10:38:26
salvage: Starting recovery of all filesets: 13-Oct-1999 10:38:27
you have mail in /usr/spool/mail/root
#
# mail
From root Wed Oct 13 10:36:06 1999
Received: by den255 id KAA01133; Wed, 13 Oct 1999 10:36:05 -0400 (EDT)
Date: Wed, 13 Oct 1999 10:36:05 -0400 (EDT)
From: system PRIVILEGED account <root>
Message-Id: <199910131436.KAA01133@den255>
Subject: EVM ALERT [600]: AdvFS: An AdvFS domain panic has occurred on bruden_dom

============================ EVM Log event ===========================
EVM event name: sys.unix.fs.advfs.fdmn.panic

This event is posted by the AdvFS filesystem to indicate that an AdvFS domain panic has occurred on the specified domain. This is due to either a metadata write error or an internal inconsistency. The domain is being rendered inaccessible.

Action: Please refer to the guidelines in the AdvFS Guide to File System Administration for the steps to recover this domain.

======================================================================

Formatted Message: AdvFS: An AdvFS domain panic has occurred on bruden_dom

Event Data Items:
    Event Name      : sys.unix.fs.advfs.fdmn.panic
    Cluster Event   : True
    Priority        : 600
    PID             : 1131
    PPID            : 664
    Event Id        : 171
    Member Id       : 0
    Timestamp       : 13-Oct-1999 10:35:04
    Host IP address : 192.206.126.27
    Host Name       : den255
    Format          : AdvFS: An AdvFS domain panic has occurred on $domain
    Reference       : cat:evmexp.cat:450

Variable Items:
    domain = "bruden_dom"

6. The show commands confirm the BMT corruption.

# showfdmn bruden_dom
showfdmn: unable to get info for domain ’bruden_dom’
showfdmn: error = E_BAD_BMT (-1171)
#
# showfsets bruden_dom
showfsets: can’t show set info for domain ’bruden_dom’
showfsets: error = E_BAD_BMT (-1171)

7. Investigate potential known bugs and patches.

8. Elevate the problem to engineering. However, keep trying to solve the problem for the customer. See if the on-disk viewing tools can help.

# nvbmtpg -rR bruden_dom
Bad mcell record 0: bCnt too large 63265
get_stripe_parms: Bad mcell RBMT vol 1 page 0 cell 4: No BSR_XTNTS record in primary mcell.

9. Block 32 of the volume should contain the RBMT (see the on-disk module).

Since we can’t seem to get anywhere, let’s look at the beginning of the RBMT. The first mcell should map the RBMT itself.

# vfilepg /dev/disk/dsk0a -b 32
Bad mcell record 0: bCnt too large 63265
get_stripe_parms: Bad mcell RBMT vol 1 page 0 cell 4: No BSR_XTNTS record in primary mcell.
==========================================================================
VOLUME "/dev/disk/dsk0a" (VDI 1) lbn 32
--------------------------------------------------------------------------
004000 08 00 00 00 00 00 00 00 13 00 00 00 00 00 00 20 ............... 
004010 06 00 00 00 01 00 00 00 fa ff ff ff 00 00 00 00 ................
004020 fe ff ff ff 00 00 00 00 5c 00 02 00 03 00 00 00 ........\.......
004030 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004040 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 ................
004050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004080 00 00 00 00 50 00 01 00 00 00 00 00 00 00 00 00 ....P...........
004090 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 ................
0040a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040c0 01 00 02 00 00 00 00 00 00 00 00 00 01 00 00 00 ................
0040d0 ff ff ff ff 04 00 00 00 00 00 00 00 00 00 00 00 ................
0040e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004130 00 00 00 00 00 00 00 00 00 00 00 00 f9 ff ff ff ................
004140 00 00 00 00 fe ff ff ff 00 00 00 00 5c 00 02 00 ............\...
004150 03 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 ................
004160 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 ................
004170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041a0 00 00 00 00 00 00 00 00 50 00 01 00 00 00 00 00 ........P.......
0041b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041c0 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041e0 00 00 00 00 01 00 02 00 00 00 00 00 70 00 00 00 ............p...
0041f0 01 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................
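A dump like this is easier to navigate with a little offset arithmetic. The sketch below uses the page layout this case study relies on (a 16-byte page header followed by 292-byte mcells, each beginning with a 24-byte header) to compute where each mcell and its first record should appear in the left-hand address column. The 8 KB page size, the 512-byte block size used to convert lbn 32 into a starting address, and all function names are assumptions made for the illustration.

```python
PAGE_SIZE = 8192     # assumed 8 KB BMT/RBMT page
PAGE_HEADER = 16     # bytes at the start of each page
MCELL_SIZE = 292     # bytes per mcell
MCELL_HEADER = 24    # bytes at the start of each mcell
BLOCK_SIZE = 512     # bytes per disk block (lbn)


def mcell_offset(cell):
    """Byte offset of an mcell from the start of its page."""
    return PAGE_HEADER + cell * MCELL_SIZE


def first_record_offset(cell):
    """Byte offset of the first record inside the given mcell."""
    return mcell_offset(cell) + MCELL_HEADER


def dump_address(lbn, page_offset):
    """Address as it appears in the left column of the dump."""
    return lbn * BLOCK_SIZE + page_offset


mcells_per_page = (PAGE_SIZE - PAGE_HEADER) // MCELL_SIZE
for cell in (0, 1, 4):
    print(cell,
          format(dump_address(32, mcell_offset(cell)), "06x"),
          format(dump_address(32, first_record_offset(cell)), "06x"))
```

For mcells 0 and 1 the computed first-record addresses are 004028 and 00414c, and the dump shows the same byte pattern (5c 00 02 00) at both places, which suggests the arithmetic lines up with the on-disk layout. Mcell 4, the cell named in the error message, falls beyond the portion of the page shown here.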

10. The event manager log shows many AdvFS domain panic messages.

# evmget | evmshow
System startup
ASCII msg: Test for EVM connection of binlogd
System timestamp
System shutdown msg: System halted by root:
System startup
ASCII msg: Test for EVM connection of binlogd
System timestamp


System startup
ASCII msg: Test for EVM connection of binlogd
System timestamp
AdvFS domain panic
AdvFS domain panic
AdvFS domain panic
AdvFS domain panic

...

11. This page can be interpreted if we remember that each BMT (and RBMT) page starts with a 16-byte header followed by a series of mcells containing a variable number of records. Each mcell is 292 bytes and contains within it a 24-byte header. Try to determine what is wrong by looking at a good domain’s RBMT.

# showfdmn usr_domain

               Id              Date Created  LogPgs  Version  Domain Name
37da6652.0009f8d0  Sat Sep 11 10:25:22 1999     512        4  usr_domain

  Vol   512-Blks      Free  % Used  Cmode  Rblks  Wblks  Vol Name
   1L    1426112    201120     86%     on    256    256  /dev/disk/dsk1g

# vfilepg /dev/rdisk/dsk1g -b 32
==========================================================================
VOLUME "/dev/rdisk/dsk1g" (VDI 1) lbn 32
--------------------------------------------------------------------------
004000 08 00 00 00 00 00 00 00 13 00 00 00 00 00 00 20 ............... 
004010 06 00 00 00 01 00 00 00 fa ff ff ff 00 00 00 00 ................
004020 fe ff ff ff 00 00 00 00 5c 00 02 00 03 00 00 00 ........\.......
004030 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004040 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 ................
004050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004080 00 00 00 00 50 00 01 00 00 00 00 00 00 00 00 00 ....P...........
004090 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 ................
0040a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040c0 01 00 02 00 00 00 00 00 20 00 00 00 01 00 00 00 ........ .......
0040d0 ff ff ff ff 04 00 00 00 00 00 00 00 00 00 00 00 ................
0040e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004130 00 00 00 00 00 00 00 00 00 00 00 00 f9 ff ff ff ................
004140 00 00 00 00 fe ff ff ff 00 00 00 00 5c 00 02 00 ............\...
004150 03 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 ................
004160 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 ................
004170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041a0 00 00 00 00 00 00 00 00 50 00 01 00 00 00 00 00 ........P.......


0041b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041c0 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041e0 00 00 00 00 01 00 02 00 00 00 00 00 70 00 00 00 ............p...
0041f0 02 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................

# nvbmtpg -rR usr_domain 1 0
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk1g) lbn 32 RBMT page 0
--------------------------------------------------------------------------
CELL 0 next mcell volume page cell 1 0 6 bfSetTag,tag -2,-6 (RBMT)

RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 32 (0x20)
bsXA[ 1] bsPage 1 vdBlk -1

--------------------------------------------------------------------------
CELL 1 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-7 (SBM)

RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 112 (0x70)
bsXA[ 1] bsPage 2 vdBlk -1

(...)

12. Determine the difference between block 32 on the two volumes.

# vfilepg /dev/rdisk/dsk1g -b 32 > /tmp/dsk1g
# vfilepg /dev/rdisk/dsk0a -b 32 > /tmp/dsk0a
get_stripe_parms: Bad mcell RBMT vol 1 page 0 cell 4: No BSR_XTNTS record in primary mcell.

# diff dsk0a dsk1g
1d0
< Bad mcell record 0: bCnt too large 63265
3c2
< VOLUME "/dev/rdisk/dsk0a" (VDI 1) lbn 32
---
> VOLUME "/dev/rdisk/dsk1g" (VDI 1) lbn 32
17c16
< 0040c0 01 00 02 00 00 00 00 00 00 00 00 00 01 00 00 00 ................
---
> 0040c0 01 00 02 00 00 00 00 00 20 00 00 00 01 00 00 00 ........ .......
36c35
< 0041f0 01 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................
---
> 0041f0 02 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................

13. The byte at dump address 0040c8 on dsk0a contains a hex 00 and should contain a hex 20. The customer agreed to try a fix to the suspected corrupted byte. The specialist wrote a program to insert a hex 20 (decimal 32) in the raw volume file at the correct location. This turned out to be the LBN field of the RBMT’s extent field. It was supposed to contain a 32, indicating that block 32 is where to find the RBMT file. The corrupted RBMT had a 00 where it should have had a 20 (hex). The fix was tried and worked. Disk corruption was ultimately determined to be the problem.

# cat putbyte.c
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_COUNT 512

int main(void)
{
    int fd, i = 0;
    long offset = 32 * 512;    /* LBN 32: start of the RBMT page */
    long off;
    char buf[READ_COUNT];      /* plain char: bytes >= 0x80 print sign-extended */
    ssize_t size;

    fd = open("/dev/rdisk/dsk0a", O_RDWR);
    if (fd == -1) {
        perror("open problem");
        exit(EXIT_FAILURE);
    }

    off = lseek(fd, offset, SEEK_SET);
    if (off == -1) {
        perror("seek problem");
        exit(EXIT_FAILURE);
    }
    printf("offset (off) is %ld, %lx\n", off, (unsigned long)off);

    size = read(fd, buf, READ_COUNT);
    if (size == -1) {
        perror("read trouble");
        exit(EXIT_FAILURE);
    }

    /* Dump the block, 16 bytes per line. */
    while (i < READ_COUNT) {
        printf("%02x ", buf[i++]);
        if ((i % 16) == 0)
            printf("\n");
    }

    /* Byte 200 (0xc8) is the low-order byte of the RBMT's vdBlk field. */
    printf("byte to change in hex is %04x.\n", *(buf + 200));
    *(buf + 200) = (char)0x20;
    printf("Changed byte is %04x.\n", *(buf + 200));

    /* Seek back and rewrite the whole block. */
    off = lseek(fd, offset, SEEK_SET);
    if (off == -1) {
        perror("seek problem");
        exit(EXIT_FAILURE);
    }
    if (write(fd, buf, READ_COUNT) == -1) {
        perror("write trouble");
        exit(EXIT_FAILURE);
    }
    close(fd);
    return 0;
}
# cc -o putbyte putbyte.c
# ./putbyte
offset (off) is 16384, 4000
08 00 00 00 00 00 00 00 13 00 00 00 00 00 00 20
06 00 00 00 01 00 00 00 fffffffa ffffffff ffffffff ffffffff 00 00 00 00
fffffffe ffffffff ffffffff ffffffff 00 00 00 00 5c 00 02 00 03 00 00 00
10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 50 00 01 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 02 00 00 00 00 00 00 00 00 00 01 00 00 00
ffffffff ffffffff ffffffff ffffffff 04 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 fffffff9 ffffffff ffffffff ffffffff
00 00 00 00 fffffffe ffffffff ffffffff ffffffff 00 00 00 00 5c 00 02 00


03 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 50 00 01 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 01 00 02 00 00 00 00 00 70 00 00 00
01 00 00 00 ffffffff ffffffff ffffffff ffffffff 04 00 00 00 00 00 00 00
byte to change in hex is 0000.
Changed byte is 0020.
# vfilepg /dev/rdisk/dsk1g -b 32 > /tmp/dsk1g
# vfilepg /dev/rdisk/dsk0a -b 32 > /tmp/dsk0a
# diff dsk0a dsk1g
2c2
< VOLUME "/dev/rdisk/dsk0a" (VDI 1) lbn 32
---
> VOLUME "/dev/rdisk/dsk1g" (VDI 1) lbn 32
35c35
< 0041f0 01 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................
---
> 0041f0 02 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................
# mount bruden_dom#bruce_fset /usr/bruce
# df
Filesystem             512-blocks     Used  Available  Capacity  Mounted on
/dev/disk/dsk1a            644808   268766     311560       47%  /
/proc                           0        0          0      100%  /proc
usr_domain#usr            1426112  1026492     200464       84%  /usr
usr_domain#var            1426112   169516     200464       46%  /var
bruden_dom#bruce_fset       50000    22788      27212       46%  /usr/bruce
# cd /usr/bruce
# ls
.tags        big5         quota.user   sm2
big4         quota.group  sm1
# ls -l
total 11402
drwx------ 2 root system 8192 Sep 28 17:04 .tags
-rwxr-xr-x 1 root system 11646960 Sep 28 17:21 big4


-rwxr-xr-x 1 obrien system 0 Sep 28 17:24 big5
-rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group
-rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user
-rw-r--r-- 1 root system 5 Oct 13 10:26 sm1
-rw-r--r-- 1 root system 10 Oct 13 10:27 sm2
# nvbmtpg -rR bruden_dom
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 32 RBMT page 0
--------------------------------------------------------------------------
There is 1 page in the RBMT on this volume.
There are 19 free mcells in the RBMT on this volume.
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 32 RBMT page 0
--------------------------------------------------------------------------
There is 1 page in the RBMT on this volume.
There are 19 free mcells in the RBMT on this volume.
==========================================================================
DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 32 RBMT page 0
--------------------------------------------------------------------------
There is 1 page in the RBMT on this volume.
There are 19 free mcells in the RBMT on this volume.


Case Study 2: Fragment-Free List Corruption

Overview

This case study describes a 1K fragment-free list corruption problem.

Problem Statement: Case Study 2

The problem statement received from the customer was:

"Following the creation of a new file on an existing AdvFS domain, it is noticed that many other files on the same domain now contain the same data."

Configuration: Case Study 2

The configuration on which the customer experienced the problem consisted of an AlphaServer 2100 running DIGITAL UNIX Version 3.2d.

Problem Description: Case Study 2

The customer creates a new file on one of the AdvFS domains, and many other files on the domain show up containing the same data. The problem is reproducible on the customer’s system.

Analysis

Here is the analysis of the problem:

1. The syslog and binlog files were checked for any hardware or system problems that may have led to the corruption. None were found.

2. Perform some testing on the customer’s system and analyze results.

The specialist created files of various sizes and checked whether the data in those files matched the data in other files on the domain.

For example, the file sal already existed in the domain (file sal was small and recently created). The specialist created a new file called jim in the same domain and entered the data JUNK1234 in the file. Listing the contents of the file sal showed it now contained the same data that had been entered for file jim.

eagles [207] # cat > jim
JUNK1234
eagles [208] # cat sal
JUNK1234


Entering the ls -li command for both files lists these characteristics:

• i-number

• Access rights

• Size (in bytes)

• Owner

• Group

• Time of last modification for each file

• File name

The modification of the contents of file sal to be identical to the contents of the newly created file jim was not recorded as the last file modification.

eagles [209] # ls -li jim sal
623 -rw-r--r-- 1 root dba 8 Oct 2 12:34 jim
567 -rw-r--r-- 1 sal dba 8 Sep 24 18:31 sal

The testing indicated that the problem manifested itself only for new, small files (<1K) created on the domain. In addition, there were multiple filesets in the domain, all of which exhibited the same behavior.

Since only new files were manifesting the problem, this pointed to a recent corruption. The size of the files being affected pointed to a problem with the allocation of small (1K) fragments from the fragment-free list for this domain.

3. Enter the AdvFS showfile command with the -x qualifier specifying both files.

The showfile command displays the attributes of one or more AdvFS files. The -x qualifier displays the full storage allocation map (extent map) for the specified files. See the AdvFS Commands Appendix for more information on this command.

eagles [212] # showfile -x jim sal

        Id  Vol  PgSz  Pages  XtntType  Segs  SegSz  Log  Perf  File
  26f.8002    2    16      0    simple    **     **  off  100%  jim

    extentMap: 1
        pageOff    pageCnt    vol    volBlock    blockCnt
        extentCnt: 0

        Id  Vol  PgSz  Pages  XtntType  Segs  SegSz  Log  Perf  File
  237.8017    3    16      0    simple    **     **  off  100%  sal

    extentMap: 1
        pageOff    pageCnt    vol    volBlock    blockCnt
        extentCnt: 0

A value of 0 for the number of extents (extentCnt) for both files indicates that neither small file has any extents. AdvFS writes files to disk in sets of 8KB pages. When a file uses only part of the last page, less than 8KB, a file fragment is created.


The fragment, which is from 1KB to 7KB in size, is allocated from the fragment file. Using fragments reduces the amount of unused, wasted disk space. The fragment file is a special file not visible in the directory hierarchy.

Given the size of these files, we expect that AdvFS allocated space for them from the 1K fragment-free list.

4. Enter the AdvFS shfragbf command.

The shfragbf command displays how much space is used on the fragment file. See the AdvFS Commands Appendix for more information on this command.

eagles [235] # /usr/field/shfragbf -t 1 -v /decsave/.tags/1 | more
--group pg = 0, next pg = -1, type = 1 nextFree = 43, numFree = 18
frag on free list : 43, nextFree = 43
frag on free list : 43, nextFree = 43
frag on free list : 43, nextFree = 43
frag on free list : 43, nextFree = 43

...

The output of the shfragbf command repeated the same values from this point onward. Therefore, any new fragment allocated from the 1K fragment-free list is receiving the same block (43) every time. Files that have already been allocated block 43 will all be pointing to the same fragment in the same block. Changing the data located at block 43 therefore causes an update of the file content for all the files pointing to that block.

For comparison, issuing the shfragbf command on a domain without this corruption displays output similar to the following. Each free fragment points to a different next fragment, and the list ends with a nextFree value of -1:

# /usr/field/shfragbf -t 1 -v /.tags/1
--group pg = 64, next pg = 112, type = 1 nextFree = 66, numFree = 34
frag on free list : 578, nextFree = 601
frag on free list : 601, nextFree = 602
frag on free list : 602, nextFree = 603
frag on free list : 603, nextFree = 604
[snip]
frag on free list : 624, nextFree = 623
frag on free list : 623, nextFree = -1

5. The msfsck and vchkdir tools were run against the domain and reported no other domain corruption.


Things Attempted: Case Study 2

1. A database search was done to see whether the problem had been seen before. A match was not found.

2. The latest patches were applied to the system. Patches do not generally clean up corruption problems, but will hopefully prevent future occurrences of known issues.

Final Solution/Summary: Case Study 2

The customer had to remake the domain and restore from backups. This was a localized corruption that was not correctable with available tools.

The customer’s highest priority was to return to production operation with the domain. The customer was less interested in having the problem fixed without remaking the domain (for example, by having a patch developed or the problem analyzed further by engineering) than in receiving some assurance that the problem would not immediately recur because of an underlying hardware or general problem.

The problem was analyzed to the point where it was fairly certain there were no hardware issues, and the customer could be reassured that recreating the domain and restoring the files would produce a clean file system.

The underlying cause of the problem was never determined.


Case Study 3: Corruption and System Panic

Overview

This case study describes an AdvFS file system corruption problem and resultant system panic.

Problem Statement: Case Study 3

The problem statement received from the customer was:

"AdvFS file systems become corrupted following upgrade from DIGITAL UNIX Version 3.2g to DIGITAL UNIX Version 4.0a".

Configuration: Case Study 3

The configuration on which the customer experienced the problem consisted of an AlphaServer 2100 running DIGITAL UNIX Version 4.0a and the Prestoserve option. Prestoserve is a nonvolatile hardware cache used to speed up synchronous access to file systems. This option consists of a hardware card and a software driver and utilities. The system was the main NFS server for the campus.

Non-Compaq (Seagate) disk drives were used in the configuration.

Problem Description: Case Study 3

After an upgrade from DIGITAL UNIX Version 3.2g to DIGITAL UNIX Version 4.0a, the system periodically corrupts files on AdvFS domains.

The customer became aware of the problem when the system reported file not found errors in response to an ls command on known, existing files.

The nature of the problem statement did not help isolate which subsystem (AdvFS, Presto, or NFS) might have been causing the corruption.

Since the customer reported CPU exceptions and the configuration did not use Compaq disk drives, hardware could also have been contributing to the problem. The disk drives were also suspected of causing problems with the device driver, since the method for detecting devices on the system had changed dramatically (for example, Dynamic Device Recognition, or DDR). The DDR interface makes assumptions about some of the parameters of the disk drives. In the case of the Seagate drives, the question was whether the assumption made about Tagged Command Queueing was incorrect, or whether the feature was not correctly implemented on the drive. The firmware provided for Compaq disk drives ensures that they always work in a specific manner. We cannot be certain of the firmware implementation in disk drives not used by Compaq.


Analysis

1. The logs were checked.

Hardware CPU exceptions were reported in the binary errlog.

2. Performed some testing to investigate the corruption.

a. Enter the ls -l command for a known file on /usr10.

# ls -l /usr10/vogar/cs2005/poly2
/usr10/vogar/cs2005/poly2 not found

The results indicate the file is not found.

b. Enter the mount | grep command for /usr10 to find the domain name.

# mount | grep /usr10
disk7#usr10 on /usr10 type advfs (rw, quota)

The results tell us that the domain name is disk7.

c. Enter the ls -l command for the /etc/fdmns directory specifying the disk7 domain.

# ls -l /etc/fdmns/disk7
total 0
lrwxrwxrwx 1 root wheel 10 Oct 21 10:26 rz12c -> /dev/rz12c

The results provide additional information about the problem domain. At this point we found that the volume was not a Compaq device (it was a Seagate drive), so a device driver issue was possible. Bad blocks on the disk are the most common cause of corruption, so look for those on this disk.

3. Possible hardware causes of the corruption were investigated.

The CPU exceptions were examined and a CPU was replaced.

The previous CPU exceptions could have caused corruption in the AdvFS structures.

4. The vdump command was used to perform a full backup. The vrestore command was used to restore the files from the savesets produced by vdump.

The full vdump and vrestore was performed to ensure data integrity after the corruption.

5. Run the verify command specifying the -d flag (to remove the corrupted files) on the domain.

The verify command checks on-disk structures such as the bitfile metadata table (BMT), the storage bitmaps, the tag directory, and the fragment file for each fileset. In this case, using the -d option also temporarily cleared the corruption.


The corrupted files that were originally present were deleted by verify. However, additional files subsequently became corrupted on several different domains within a few hours after running at DIGITAL UNIX Version 4.0a with the Prestoserve option enabled.

At one point, the domains became so corrupted that attempting to mount them or repair the corruption resulted in a system panic.

6. Contact engineering to reassess the problem.

The problem was entered into the Integrated Problem Management Tool (IPMT), a Web-based tracking tool for escalating customer problems to USEG.

An AdvFS file system corruption was confirmed. Using the verify command to correct the problem was only partially successful. At this point it was still not determined which subsystem the problem was in. The panics came from AdvFS, but that was a symptom and did not point to the source of corruption. Crash dumps from the system panics were obtained and provided to engineering.

The crash dumps that occurred were all of the same type, ultimately helping to identify and reproduce the problem.

The following is a representative crash-data file for the panics:

#
# Crash Data Collection (Version 1.4)
#
_crash_data_collection_time: Sat Nov 9 12:34:16 EST 1996
_current_directory: /
_crash_kernel: /var/adm/crash/vmunix.12
_crash_core: /var/adm/crash/vmcore.12
_crash_arch: alpha
_crash_os: Digital UNIX
_host_version: Digital UNIX V4.0A (Rev. 464); Fri Oct 25 18:54:14 EDT 1996
_crash_version: Digital UNIX V4.0A (Rev. 464); Fri Oct 25 18:54:14 EDT 1996
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
_crashtime: struct {
    tv_sec = 847560441
    tv_usec = 435137
}
_boottime: struct {
    tv_sec = 847559468
    tv_usec = 118564
}
_config: struct {
    sysname = "OSF1"
    nodename = "res.WPI.EDU"
    release = "V4.0"
    version = "464"
    machine = "alpha"
}
_cpu: 35
_system_string: 0xffffffffff8010b8 = "AlphaServer 2100 4/200"
_ncpus: 3
_avail_cpus: 3


_partial_dump: 1
_physmem(MBytes): 319
_panic_string: 0xffffffff94ccb328 = "bad v1 frag free list"
_paniccpu: 0
_panic_thread: 0xfffffc0000e0a840
_preserved_message_buffer_begin:
struct {
    msg_magic = 0x63061
    msg_bufx = 0xb84
    msg_bufr = 0xa3e
    msg_bufc = "Alpha boot: available memory from 0xc88000 to 0x13ffe000
Digital UNIX V4.0A (Rev. 464); Fri Oct 25 18:54:14 EDT 1996
physical memory = 320.00 megabytes.
available memory = 307.56 megabytes.
using 1221 buffers containing 9.53 megabytes of memory
Master cpu at slot 0.
Firmware revision: 4.6
PALcode: OSF version 1.45
ibus0 at nexus
AlphaServer 2100 4/200
cpu 0 EV-4s 1mb b-cache
cpu 1 EV-4s 1mb b-cache
cpu 2 EV-4s 1mb b-cache
gpc0 at ibus0
pci0 at ibus0 slot 0
tu0: DECchip 21040-AA: Revision: 2.2
tu0 at pci0 slot 0
tu0: DEC TULIP Ethernet Interface, hardware address: 08-00-2B-E2-65-1C
tu0: console mode: selecting 10BaseT (UTP) port: half duplex: no link
psiop0 at pci0 slot 1
Loading SIOP: script 1000000, reg 81000000, data 100df38
scsi0 at psiop0 slot 0
rz0 at scsi0 target 0 lun 0 (LID=0) (DEC RZ28 (C) DEC D41C)
rz1 at scsi0 target 1 lun 0 (LID=1) (SEAGATE ST32550N 0012)
rz2 at scsi0 target 2 lun 0 (LID=2) (SEAGATE ST15150N 0017)
rz3 at scsi0 target 3 lun 0 (LID=3) (SEAGATE ST15150N 0017)
eisa0 at pci0
ace0 at eisa0
ace1 at eisa0
lp0 at eisa0
fdi0 at eisa0
fd0 at fdi0 unit 0
fd1 at fdi0 unit 1
qvision0 at eisa0
qvision0: CMPQ Qvision 1024/E SVGA
tu1: DECchip 21140-AA: Revision: 1.2
tu1 at pci0 slot 6
tu1: DEC Fast Ethernet Interface, hardware address: 00-00-F8-02-8B-0E
tu1: console mode: selecting 100BaseTX (UTP) port: full duplex
pnvram0: Module Revision 16, Cache size: 8387584
pnvram0 at pci0 slot 7
pnvram_ssn: NO System Serial Number
presto: NVRAM tested readonly ok
presto: using 8387584 bytes of NVRAM at 0xc0000400
presto: primary battery ok


psiop1 at pci0 slot 8
Loading SIOP: script 102c000, reg 81000100, data 40608338
scsi1 at psiop1 slot 0
rz8 at scsi1 target 0 lun 0 (LID=4) (SEAGATE ST12550N 0013)
rz9 at scsi1 target 1 lun 0 (LID=5) (SEAGATE ST12550N 0013)
rz10 at scsi1 target 2 lun 0 (LID=6) (SEAGATE ST12550N 0013)
rz11 at scsi1 target 3 lun 0 (LID=7) (DEC RRD45 (C) DEC 0436)
rz12 at scsi1 target 4 lun 0 (LID=8) (SEAGATE ST15150N 0017)
tz14 at scsi1 target 6 lun 0 (LID=9) (DEC TLZ06 (C)DEC 4BQE)
lvm0: configured.
lvm1: configured.
kernel console: qvision0
dli: configured
ADVFS: using 2907 buffers containing 22.71 megabytes of memory
vm_swap_init: warning /sbin/swapdefault swap device not found
vm_swap_init: swap is set to lazy (over commitment) mode
Starting secondary cpu 1
Starting secondary cpu 2
SuperLAT. Copyright 1994 Meridian Technology Corp. All rights reserved.
tu1: transmit FIFO underflow: threshold raised to: 256 bytes
rfs_dispatch: sendreply failed
rfs_dispatch: sendreply failed
ADVFS EXCEPTION
Module = bs_bitfile_sets.c, Line = 1511
bad v1 frag free list
panic (cpu 0): bad v1 frag free list
syncing disks... 14 done
device string for dump = SCSI 0 1 0 0 0 0 0.
DUMP.prom: dev SCSI 0 1 0 0 0 0 0, block 131072
device string for dump = SCSI 0 1 0 0 0 0 0.
DUMP.prom: dev SCSI 0 1 0 0 0 0 0, block 131072
"
}
_preserved_message_buffer_end:
_kernel_process_status_begin:
  PID COMM
00000 kernel idle
00001 init
00003 kloadsrv
00039 update
00127 syslogd
00129 binlogd
00316 portmap
00324 ypbind
00333 mountd
00335 nfsd
00337 nfsiod
00340 rpc.statd
00342 rpc.lockd
00439 prestoctl_svc
00454 sendmail
00473 xntpd
00506 snmpd
00507 inetd
00509 os_mibs


00544 cron
00570 lpd
00579 lpd
00580 rlogind
00581 tcsh
00583 smbd
00587 nmbd
00622 pwd_server
00627 httpsd
00637 httpsd
00639 httpsd
00640 httpsd
00641 httpsd
00642 httpsd
00643 httpsd
00658 erpcd
00670 rarpd
00684 asplmd.exe
00705 dtlogin
00718 getty
00721 Xdec
00726 dtlogin
00780 dxconsole
00781 dtgreet
00808 rlogind
00809 tcsh
00842 httpsd
00881 tcsh
00883 lynx
00887 httpsd
00892 httpsd
00898 telnetd
00899 tcsh
00918 pine
00999 rpc.ttdbserverd
01053 httpsd
01057 httpsd
01188 httpsd
01201 httpsd
01207 httpsd
01236 httpsd
01264 httpsd
01342 httpsd
01343 httpsd
01344 httpsd
01354 vi
_kernel_process_status_end:
_current_pid: 0
_current_tid: 0xfffffc0000e0a840
_proc_thread_list_begin:
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
thread 0xfffffc0000e0ab00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0b080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0b340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0b600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]


thread 0xfffffc0000e0b8c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0bb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc000412a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc000412a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0a580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1bb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1b8c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1b600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1b340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1b080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1adc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1ab00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1a840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1a580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012529b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc00125298c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012529600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012529340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012529080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012528dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012528b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012528840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012528580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc00125282c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012528000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0004641b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc00046418c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0004641600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0004641340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0004641080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c99080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c98dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c98b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c98840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c98580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c982c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c98000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012107b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012107600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012107340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012107080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012106dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012106b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012106840 stopped at [stop_secondary_cpu:499,0xfffffc00004719
thread 0xfffffc0012106580 stopped at [stop_secondary_cpu:499,0xfffffc00004719
thread 0xfffffc00121062c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012106000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecfb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecf8c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecf600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecf340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]


thread 0xfffffc0013ecf080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecedc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013eceb00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ece840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ece580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ece2c0 stopped at [thread_run:2438 ,0xfffffc00002a8898] So
_proc_thread_list_end:
warning: Files compiled -g3: parameter values probably wrong
_dump_begin:
   0 boot(0x0, 0xfffffc0000e0a840, 0x2c0000002c, 0x31, 0xfffffc0000000001) ["../
   1 panic(s = 0xffffffff94ccb328 = "bad v1 frag free list") ["../../../../src/k
pcpu = (nil)
i = 6121648
mycpu = 0
spl = 0
   2 advfs_sad(0x76, 0x9000005, 0xfffffc0008260000, 0x13af6, 0xfffffc00005839d8)
   3 bs_frag_alloc(setp = 0xfffffc00041e8c08, ftxH = struct {
    hndl = 5
    level = 0
    dmnh = 9
}, fragId = 0xffffffff94ccb640) ["../../../../src/kernel/msfs/bs/bs_bitfile_sets
frag = 80630
grpPg = 16
fragPg = 4294967295
fragHdrp = 0x76
grpHdrp = 0xfffffc0008260000
grpPgp = 0xfffffc0008260000
fragPgp = 0x6
setAttrp = (nil)
pinPgH = struct {
    hndl = 396
    dmnh = 0
    pgHndl = 0
}
grpPgRef = struct {
    hndl = 5
    dmnh = 9
    pgHndl = 1
}
fragPgRef = struct {
    hndl = 5
    dmnh = 0
    pgHndl = 9
}
   4 fs_create_frag(0xffffffff804dc528, 0xfffffc0012c1fc00, 0x9000005,0xfffffc0
   5 close_one_int(bfap = 0xffffffff804dc528, parentFtxH = struct {
    hndl = 50520
    level = 77
    dmnh = 128
}) ["../../../../src/kernel/msfs/bs/bs_access.c":3422, 0xfffffc00002ed6ec]
ftxH = struct {
    hndl = 5
    level = 0


    dmnh = 9
}
prevState = ACC_VALID
delVdp = 0xfffffc00040ed008
delList = 0xfffffc000031d884
delCnt = 0
delMCId = struct {
    cell = 12
    page = 12
}
dmnp = 0xfffffc00040ed008
cp = (nil)
fragFlag = 1
deleteIt = 0
ftxFlag = 0
vp = 0xfffffc00040ed008
   6 close_int() ["../../../../src/kernel/msfs/bs/bs_access.c":3179,0xfffffc000
bfap = 0xffffffff804dc528
   7 bs_vfs_close(bfAccessH = 4716512) ["../../../../src/kernel/msfs/bs/bs_acces
   8 msfs_inactive(0xfffffc0000357308, 0x12, 0xfffffc000042db44, 0xfffffc00110cf
   9 vrele(vp = 0xfffffc00110cf200) ["../../../../src/kernel/vfs/vfs_subr.c":230
  10 rfs3_writeg() ["../../../../src/kernel/nfs/nfs3_server.c":2827,0xfffffc000
error = 0
bverror = 59160000
resp2 = 0xfffffc00110cf200
piov = 0xfffffc000f555a48
piovlen = 286061056
psv = 0xffffffff80476db0
minoffset = 8192
maxoffset = 18446739675949101568
first = 286061056
imgathering = 0
dupwrite = 60579840
estale = 0
prestohere = 0
dummy = 0
didmyreply = 0
prevwr = 0x40000000000
mywritelist = (nil)
dev = 4716512
pbp = 0xffffffff80476de8
  11 rfs3_write(args = 0xfffffc00036a8e60, resp = 0xfffffc000f555a40, nreq = 0xf
psv = 0xffffffff80476db0
vp = 0xfffffc000047f7e0
  12 rfs_dispatch(req = (nil), xprt = 0xfffffc00015a5700) ["../../../../src/kern
disp = 0xfffffc00005a5d20
error = 0
nreq = 0xfffffc000f65a180
args = 0xfffffc00036a8e60 = " "
res = 0xfffffc000f555a40 = ""
ep = 0xfffffc00044ee680


fh = 0xfffffc00015a5700
which = 7
ret = 0
dupstat = 2
args_translated = 1
psv = 0xffffffff80476db0
count = 0
buff = ""
  13 nfs_rpc_recv(0xa305dc1600000007, 0xfffffc0000000001, 0xfffffc0013631a40,0x
  14 nfs_rpc_input(0xfffffc0013631a40, 0xfffffc0000000024, 0x0,0xfffffc0013631b
  15 nfs_input(m = 0xfffffc00034b5300) ["../../../../src/kernel/nfs/nfs_server.c
psv = 0xffffffff80476db0
xprt = 0xfffffc00015a5700
savecr = 0xfffffc0013ecc100
save_nd = 0xfffffc0013ecb158
ip = 0xfffffc00015a5700
uh = 0xfffffc0013ecc100
ui = 0xffffffff80476db0
len = 2
n = 0x28000
udp_in = struct {
    sin_len = '^P'
    sin_family = '^B'
    sin_port = 65027
    sin_addr = struct {
        s_addr = 102291330
    }
    sin_zero = ""
}
  16 nfs_thread() ["../../../../src/kernel/nfs/nfs_server.c":5714,0xfffffc00003
m = 0xfffffc000254eb00
thread = 0xfffffc0000e0a840
psv = 0xffffffff80476db0
_dump_end:
_kernel_thread_list_begin:
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
thread 0xfffffc0000e0ab00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0b080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0b340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0b600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0b8c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0bb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc000412a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc000412a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0a580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1bb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1b8c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1b600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1b340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1b080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1adc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1ab00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]


thread 0xfffffc0012d1a840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1a580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012529b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc00125298c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012529600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012529340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012529080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012528dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012528b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012528840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012528580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc00125282c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012528000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0004641b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc00046418c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0004641600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0004641340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0004641080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c99080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c98dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c98b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c98840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c98580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c982c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000c98000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012107b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012107600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012107340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012107080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012106dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012106b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012106840 stopped at [stop_secondary_cpu:499,0xfffffc00004719
thread 0xfffffc0012106580 stopped at [stop_secondary_cpu:499,0xfffffc00004719
thread 0xfffffc00121062c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012106000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecfb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecf8c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecf600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecf340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecf080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecedc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013eceb00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ece840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ece580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ece2c0 stopped at [thread_run:2438 ,0xfffffc00002a8898] So
_kernel_thread_list_end:
_savedefp: (nil)
_kernel_memory_fault_data_begin:
struct {
    fault_va = 0x0
    fault_pc = 0x0
    fault_ra = 0x0


    fault_sp = 0x0
    access = 0x0
    status = 0x0
    cpunum = 0x0
    count = 0x0
    pcb = (nil)
    thread = (nil)
    task = (nil)
    proc = (nil)
}
_kernel_memory_fault_data_end:
_uptime: .27 hours
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
paniccpu: 0x0
machine_slot[paniccpu]: struct {
    is_cpu = 0x1
    cpu_type = 0xf
    cpu_subtype = 0x9
    running = 0x1
    cpu_ticks = {
        [0] 0x4708
        [1] 0x0
        [2] 0x1998d
        [3] 0xd55ad
        [4] 0x273a
    }
    clock_freq = 0x400
    error_restart = 0x0
    cpu_panicstr = 0xffffffff94ccb328 = "bad v1 frag free list"
    cpu_panic_thread = 0xfffffc0000e0a840
}
tset machine_slot[paniccpu].cpu_panic_thread:
Begin Trace for machine_slot[paniccpu].cpu_panic_thread:
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
>  0 boot(0x0, 0xfffffc0000e0a840, 0x2c0000002c, 0x31, 0xfffffc0000000001) ["../
   1 panic(s = 0xffffffff94ccb328 = "bad v1 frag free list") ["../../../../src/k
   2 advfs_sad(0x76, 0x9000005, 0xfffffc0008260000, 0x13af6, 0xfffffc00005839d8)
   3 bs_frag_alloc(setp = 0xfffffc00041e8c08, ftxH = struct {
    hndl = 0x5
    level = 0x0
    dmnh = 0x9
}, fragId = 0xffffffff94ccb640) ["../../../../src/kernel/msfs/bs/bs_bitfile_sets
   4 fs_create_frag(0xffffffff804dc528, 0xfffffc0012c1fc00, 0x9000005,0xfffffc0
   5 close_one_int(bfap = 0xffffffff804dc528, parentFtxH = struct {
    hndl = 0xc558
    level = 0x4d
    dmnh = 0x80
}) ["../../../../src/kernel/msfs/bs/bs_access.c":3422, 0xfffffc00002ed6ec]
   6 close_int() ["../../../../src/kernel/msfs/bs/bs_access.c":3179,0xfffffc000
   7 bs_vfs_close(bfAccessH = 0x47f7e0) ["../../../../src/kernel/msfs/bs/bs_acce
   8 msfs_inactive(0xfffffc0000357308, 0x12, 0xfffffc000042db44,0xfffffc00110cf


   9 vrele(vp = 0xfffffc00110cf200) ["../../../../src/kernel/vfs/vfs_subr.c":230
  10 rfs3_writeg() ["../../../../src/kernel/nfs/nfs3_server.c":2827,0xfffffc000
  11 rfs3_write(args = 0xfffffc00036a8e60, resp = 0xfffffc000f555a40, nreq = 0xf
  12 rfs_dispatch(req = (nil), xprt = 0xfffffc00015a5700) ["../../../../src/kern
  13 nfs_rpc_recv(0xa305dc1600000007, 0xfffffc0000000001, 0xfffffc0013631a40,0x
  14 nfs_rpc_input(0xfffffc0013631a40, 0xfffffc0000000024, 0x0,0xfffffc0013631b
  15 nfs_input(m = 0xfffffc00034b5300) ["../../../../src/kernel/nfs/nfs_server.c
  16 nfs_thread() ["../../../../src/kernel/nfs/nfs_server.c":5714,0xfffffc00003
End Trace for machine_slot[paniccpu].cpu_panic_thread:
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
"cpu_data" is not an array
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
_stack_trace[0]_begin:
>  0 boot(0x0, 0xfffffc0000e0a840, 0x2c0000002c, 0x31, 0xfffffc0000000001) ["../
   1 panic(s = 0xffffffff94ccb328 = "bad v1 frag free list") ["../../../../src/k
   2 advfs_sad(0x76, 0x9000005, 0xfffffc0008260000, 0x13af6, 0xfffffc00005839d8)
   3 bs_frag_alloc(setp = 0xfffffc00041e8c08, ftxH = struct {
    hndl = 5
    level = 0
    dmnh = 9
}, fragId = 0xffffffff94ccb640) ["../../../../src/kernel/msfs/bs/bs_bitfile_sets
   4 fs_create_frag(0xffffffff804dc528, 0xfffffc0012c1fc00, 0x9000005,0xfffffc0
   5 close_one_int(bfap = 0xffffffff804dc528, parentFtxH = struct {
    hndl = 50520
    level = 77
    dmnh = 128
}) ["../../../../src/kernel/msfs/bs/bs_access.c":3422, 0xfffffc00002ed6ec]
   6 close_int() ["../../../../src/kernel/msfs/bs/bs_access.c":3179,0xfffffc000
   7 bs_vfs_close(bfAccessH = 4716512) ["../../../../src/kernel/msfs/bs/bs_acces
   8 msfs_inactive(0xfffffc0000357308, 0x12, 0xfffffc000042db44,0xfffffc00110cf
   9 vrele(vp = 0xfffffc00110cf200) ["../../../../src/kernel/vfs/vfs_subr.c":230
  10 rfs3_writeg() ["../../../../src/kernel/nfs/nfs3_server.c":2827,0xfffffc000
  11 rfs3_write(args = 0xfffffc00036a8e60, resp = 0xfffffc000f555a40, nreq = 0xf
  12 rfs_dispatch(req = (nil), xprt = 0xfffffc00015a5700) ["../../../../src/kern
  13 nfs_rpc_recv(0xa305dc1600000007, 0xfffffc0000000001, 0xfffffc0013631a40,0x
  14 nfs_rpc_input(0xfffffc0013631a40, 0xfffffc0000000024, 0x0,0xfffffc0013631b
  15 nfs_input(m = 0xfffffc00034b5300) ["../../../../src/kernel/nfs/nfs_server.c
  16 nfs_thread() ["../../../../src/kernel/nfs/nfs_server.c":5714,0xfffffc00003
_stack_trace[0]_end:
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
"cpu_data" is not an array
thread 0xfffffc0012106580 stopped at [stop_secondary_cpu:499,0xfffffc00004719
warning: Files compiled -g3: parameter values probably wrong
_stack_trace[1]_begin:
>  0 stop_secondary_cpu(do_lwc = 1) ["../../../../src/kernel/arch/alpha/cpu.c":4
   1 panic(s = 0xfffffc00005b1858 = "cpu_ip_intr: panic request") ["../../../../
   2 cpu_ip_intr() ["../../../../src/kernel/arch/alpha/cpu.c":629,0xfffffc00004


   3 _XentInt(0x0, 0xfffffc00002a9c24, 0xfffffc00005d68b0, 0x3fff, 0x1) ["../../
   4 idle_thread() ["../../../../src/kernel/kern/sched_prim.c":3302,0xfffffc000
   5 vm_page_tester() ["../../../../src/kernel/vm/vm_resident.c":1930,0xfffffc0
_stack_trace[1]_end:
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
"cpu_data" is not an array
thread 0xfffffc0012106840 stopped at [stop_secondary_cpu:499,0xfffffc00004719
warning: Files compiled -g3: parameter values probably wrong
_stack_trace[2]_begin:
>  0 stop_secondary_cpu(do_lwc = 1) ["../../../../src/kernel/arch/alpha/cpu.c":4
   1 panic(s = 0xfffffc00005b1858 = "cpu_ip_intr: panic request") ["../../../../
   2 cpu_ip_intr() ["../../../../src/kernel/arch/alpha/cpu.c":629, 0xfffffc00004
   3 _XentInt(0x0, 0xfffffc00002a9be0, 0xfffffc00005d68b0, 0xfffffc0000200f00, 0
   4 idle_thread() ["../../../../src/kernel/kern/sched_prim.c":3290,0xfffffc000
   5 vm_page_tester() ["../../../../src/kernel/vm/vm_resident.c":1987,0xfffffc0
_stack_trace[2]_end:

The stack trace of the panic thread passed through NFS, VFS, and AdvFS code. The panic therefore did not isolate any one of these subsystems as the cause of the problem.
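One quick way to see which subsystems a trace passes through is to pull the kernel source directory out of each frame. The following is a minimal sketch, not part of any Tru64 UNIX tool: the function name trace_subsystems is hypothetical, and it assumes frames carry paths of the form src/kernel/<dir>/. Frames whose paths were cut off by the line width of the crash data are simply skipped.

```shell
#!/bin/sh
# Hypothetical helper (not a Tru64 UNIX utility): read a dbx-style stack trace
# on standard input and print each kernel source directory it mentions once,
# in order of first appearance (top of the stack first).
trace_subsystems() {
    # Keep only lines containing src/kernel/<dir>/ and reduce them to <dir>.
    sed -n 's|.*src/kernel/\([a-z0-9_]*\)/.*|\1|p' |
    # Drop repeats while preserving order.
    awk '!seen[$0]++'
}

# Usage with three frames copied from the panic thread's trace:
trace_subsystems <<'EOF'
 7 bs_vfs_close(bfAccessH = 4716512) ["../../../../src/kernel/msfs/bs/bs_acces
 9 vrele(vp = 0xfffffc00110cf200) ["../../../../src/kernel/vfs/vfs_subr.c":230
10 rfs3_writeg() ["../../../../src/kernel/nfs/nfs3_server.c":2827,0xfffffc000
EOF
```

For these frames the helper prints msfs, vfs, and nfs, one per line. In this trace the AdvFS routines live under the msfs directory, so all three subsystems named above appear and none stands out.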

7. Attempt to isolate the source of the problem.

a. Since the problem was not present before any upgrade was performed, revert to a known working configuration (DIGITAL UNIX Version 3.2g). The customer was able to continue to run booted from a DIGITAL UNIX Version 3.2g system disk without the AdvFS file corruption.

Since the problem appeared after an upgrade from DIGITAL UNIX Version 3.2g to DIGITAL UNIX Version 4.0a, the problem was likely related to changes in source code between the two versions. At this point it was still unknown whether it was a DIGITAL UNIX, AdvFS, Presto, or NFS bug.

b. Time was spent with engineering to determine that the corruption and panics were related neither to the hardware nor to changes in the CAM driver.

c. A dd copy of one of the corrupted domains was made to help reproduce the corruption and panic.

d. The system was tested for corruption when running DIGITAL UNIX Version 4.0a with the Prestoserve option disabled to determine if the corruption was related to the Prestoserve option.
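The dd copy in step 7c is a raw, block-for-block image of a volume. The following is a minimal sketch of that idea, not the procedure the specialist actually used: the helper name copy_volume_image and the 64 KB block size are assumptions, and on a real system the source would be the device special file of the volume while the domain is not in use.

```shell
#!/bin/sh
# Hypothetical helper: make a block-for-block image of a volume with dd and
# confirm with cmp that the image matches the source.
# $1 = source (a device special file on a real system), $2 = image file.
copy_volume_image() {
    src=$1
    img=$2
    # bs=64k is an assumed block size chosen only to speed up the copy.
    dd if="$src" of="$img" bs=64k 2>/dev/null || return 1
    # cmp -s is silent and returns 0 only when the two are byte-identical.
    cmp -s "$src" "$img"
}
```

A call might look like copy_volume_image /dev/disk/dsk3c /scratch/dsk3c.img, where both paths are placeholders. Working on an image leaves the customer's damaged volume untouched while engineering tries to reproduce the problem.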

8. Elevate to engineering for the creation of a Prestoserve patch.

Once the problem was isolated to Prestoserve, we found that the latest patch kit already contained a Prestoserve patch, although it had been issued to fix a different problem.

It is always a good idea to install any related patches (in this case, the patch for presto.mod), even if the patch README does not specifically mention the problem, for two reasons:


• The patch README does not always list all the things fixed by the patch.

• The patch may change the behavior of the problem and may provide additional information that can assist in troubleshooting.

So, although the patch README did not mention this problem, we had the customer install it to see if it might meet one of the two criteria above.

The engineer soon found that there was a locking problem in the Prestoserve module and produced a patch. The first version of this patch was not successful because it caused NFS to lock up. The second iteration of the patch proved to solve the problem.

Things Attempted: Case Study 3

1. Installed available AdvFS, Prestoserve option, NFS, and Tru64 UNIX kernel patches.

2. Upgraded to latest release of Tru64 UNIX.

3. Replaced the CPU due to some CPU exceptions reported in the binary errorlog file.

Final Solution/Summary: Case Study 3

• The domains had to be remade and restored from backup.

• After the problem was resolved, the customer's system was running DIGITAL UNIX Version 4.0b with a new CPU and a Prestoserve module kernel patch.

• The final solution was a Prestoserve module kernel patch provided by engineering.

Turning off the Prestoserve option was actually one of the first things that the specialist recommended to the customer. The customer refused to run this test unless we could prove to him beforehand that Prestoserve was causing the problem. He felt that the performance gained by using this option was a necessity on this system, which acted as the main NFS server for the entire campus. The customer preferred to run DIGITAL UNIX Version 3.2g rather than accept the performance degradation of turning off the Prestoserve option.

One reason the customer wanted to upgrade from DIGITAL UNIX Version 3.2g to DIGITAL UNIX Version 4.0a was to obtain the performance improvement from the unfunneling of AdvFS in 4.0x. After approximately two months of ineffective troubleshooting, the customer was willing to try running with the Prestoserve option disabled. The file system corruption did not occur under this test, indicating that the source of the corruption was the Prestoserve option in conjunction with AdvFS. The problem was subsequently isolated to the Prestoserve driver for DIGITAL UNIX Version 4.0x.


As you will recall from the stack trace, the panic did not include Prestoserve. Therefore, the panic was a symptom, not an indication that there was a problem in any of the subsystems that showed up in the trace.

Disabling the Prestoserve option did cause a loss of performance, which displeased the customer. Once the Prestoserve patch was installed, the customer re-enabled the option and regained the performance.


Using the salvage Utility

What is salvage?

salvage is a new AdvFS utility available in the Tru64 UNIX Version 5.0 release. A field test version of salvage is available in DIGITAL UNIX V4.0D. (The latest versions of salvage for V4.0x and earlier releases of DIGITAL UNIX can be obtained from the UNIX Support Engineering Group (USEG) at the File Systems and Cluster Support Web page on sunny.alf.dec.com.)

salvage can recover information at the block level from disks containing damaged AdvFS domains (that is, domains whose filesets cannot be mounted).

The syntax for the salvage utility is as follows:

/sbin/advfs/salvage [-x | -p] [-l] [-S] [-v number] [-d time] [-D directory] [-f archive] [-F format] [-L path] [-o option] { -V special [-V special]... | domain } [fileset [path]]

The domain parameter specifies the name of an existing AdvFS file domain from which filesets are to be recovered. Use this parameter when you want the utility to obtain volume information from the /etc/fdmns directory. The volume information used by the utility consists of the device special file names of the AdvFS volumes in the file domain. When the domain parameter is specified without optional arguments, the utility attempts to recover the files in all filesets in the domain. Do not use this parameter when you want to use the -V special flag to specify device special file names of AdvFS volumes. If you do, the utility displays an error message and exits with an exit value of 2.

The fileset [path] parameter specifies the name of a fileset to be recovered from a domain or a volume. Specify path to indicate the path of a directory or file in a fileset. When you specify a path that is a directory, the utility attempts to recover only the files in that directory tree, starting at the specified directory. When you specify a path that is a file, the utility attempts to recover only that file. Specify path relative to the mount point of the fileset.

Table 5-4: salvage Options

Option Function

-d time Specifies the time as a decimal number in the format [[CC]YY]MMDDhhmm[.SS]. When this flag is specified, salvage recovers only those files modified since this time.

-D directory Specifies the path of the directory to which all recovered files are written. If you do not specify a directory, the utility writes recovered files to the current working directory.

-f archive Uses the next argument as the name of an archive. If the argument is "-", salvage writes to standard output.

-F format Specifies that salvage should recover files in an archive format. As of V5.0, the only valid value is tar.

-l Specifies verbose mode for the log file: a message is written for every file encountered during the recovery. If you do not specify this flag, the utility writes a message to the log file only for partially recovered and unrecovered files.

-L path Specifies the path of the directory or the file name for the log file that is to contain the messages logged by this utility. If you include a log file name in the path, the utility uses that file name. If no log file name is specified, the utility places the log file in the specified directory and names it salvage.log.pid (pid is the process ID of the user process). When you do not specify this flag, the utility places the log file in the current working directory and names it salvage.log.pid.

-o option Specifies the action the utility takes when a file being recovered already exists in the directory to which it is to be written. The values for option are:

• yes: Overwrites the existing file without querying the user. This is the default action when option is not specified.

• no: Does not overwrite the existing file.

• ask: Asks the user whether to overwrite the existing file.

-S Specifies that the utility is to run in sequential search mode, checking each page on each volume in the domain. This takes a long time on large AdvFS file domains. This flag can recover most files from a domain damaged by an incorrect execution of the mkfdmn utility. In some cases, the recovery must generate names based on the file’s tag number. These cases usually occur in the root directory, because mkfdmn usually overwrites this directory.

When you specify this flag, there may be a security issue because the utility could recover old filesets and deleted files.

-v number Specifies the type of messages directed to stdout. If you do not specify this flag, the default is to direct only error messages to stdout. If you specify number to be 1, both errors and the names of partially recovered files are directed to stdout. If you specify number to be 2, error messages and the status of all files as they are recovered are directed to stdout.

-V special [-V special]... Specifies the block device special file names of volumes in the domain, for example, /dev/disk/dsk3c. The utility attempts to recover files only from the volumes you specify. If you do not specify the -V flag, you must specify the domain parameter so that the utility can obtain the special file names of the volumes in the domain from the /etc/fdmns directory. Do not use this flag with the domain parameter. If you do, an error message is displayed and the utility exits with an exit value of 2.

-x Specifies that partially recoverable files are not to be recovered. If you do not use this flag, partial files are recovered.

Do not use the -x flag with the -p flag. If you do, the utility displays an error message and exits with an exit value of 2.

Operation

The salvage utility helps you recover file data after an AdvFS file domain has become unmountable due to some type of data corruption. Errors that could cause data corruption of a file domain include I/O errors in file system metadata or the accidental removal of a volume.

As the utility recovers files, it saves relevant information in memory. It requires enough disk space to save the recovered files plus the log file.

Running the salvage utility does not guarantee that you will recover all your information. You may be missing files, directories, file names, or parts of files. The utility generates a log file that contains the status of files that were recovered. Use the -l flag to log the status of all files that are encountered. There is a lost+found directory that contains files for which no parent directory can be found.

salvage places recovered files in directories named after the filesets. Recovered information must be moved to new filesets before you can remount the files as a fileset. You can specify the path name of the directory into which the files are recovered. If you do not specify a directory, the utility writes recovered files to the current working directory.

The results of using this utility can include some fully recovered files, partially recovered files, and unrecovered files. A partially recovered file is one for which salvage could not obtain all of the file’s data because of corrupt metadata, I/O errors reading the disk, or missing volumes. If the partially recovered file is an ASCII file, the user may be able to fill in the missing data. Compaq cannot provide any more recovery assistance for these types of files at this point.

The salvage utility opens and reads block devices directly and could present a security issue if it recovers data remaining from previous AdvFS file domains while attempting to recover data from current AdvFS file domains.

The salvage utility can be run in single-user mode, without mounting other file systems. The salvage utility is available from the UNIX Shell option when you are starting from the Tru64 UNIX operating system volume 1 CD-ROM.

You must have root user privilege to use the salvage utility.
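Because salvage needs room for the recovered files plus its log file, it can be worth checking free space in the recovery directory before starting. The following is a minimal sketch under stated assumptions: the helper names free_kb and enough_space are hypothetical, and the df command is assumed to accept the POSIX -P and -k options. How much space a given recovery needs is not something this sketch can estimate; the threshold is supplied by the caller.

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in kilobytes, of the file system
# that holds directory $1 (default: the current working directory).
free_kb() {
    # df -P gives the portable one-line-per-file-system layout; -k reports
    # 1024-byte units. Column 4 of the data line is the available space.
    df -Pk "${1:-.}" | awk 'NR == 2 { print $4 }'
}

# Hypothetical pre-check: succeed only if directory $1 has at least $2 KB free.
enough_space() {
    [ "$(free_kb "$1")" -ge "$2" ]
}
```

For example, enough_space /fixit 2000000 succeeds only when /fixit has roughly 2 GB free; both the directory and the figure are placeholders.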

salvage Examples

The following example shows a salvage command that uses all the defaults to recover all files from the AdvFS file domain named user_domain. Other results include:

• A log file named salvage.log.pid is written to the fixit directory.

• The files recovered from the user_domain are also written to the fixit directory.

• Partially recoverable files are included in the recovered files. These files are written to the fixit directory.

# cd /fixit
# /sbin/advfs/salvage user_domain

This example shows a salvage command that uses the -d option to recover all files in the domain user_domain that have been changed since 00:00 on November 20, 1996.

# cd /fixit
# /sbin/advfs/salvage -d 199611200000 user_domain


The following example shows a salvage command that recovers the file data.file, whether or not it is only partially recoverable, from the fileset user_fileset on the volume /dev/disk/dsk3c. The data.file file is written to the recovery directory and is logged in the log file only if it was partially recovered.

# cd /fixit
# /sbin/advfs/salvage -V /dev/disk/dsk3c user_fileset/data.file

The following example shows a salvage command that recovers the file data.file, only if it is fully recoverable, from the fileset user_fileset on the domain user_domain. The data.file file, if it is not recovered, is logged in the log file. Otherwise, it is written to the recovery directory.

# cd /fixit
# /sbin/advfs/salvage -x user_domain user_fileset/data.file
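Table 5-4 states that -x and -p must not be combined and that salvage exits with a value of 2 if they are. A script that assembles salvage arguments could catch that combination before calling the utility. The following is a minimal sketch of such a pre-check: the function name and the error text are hypothetical and are not salvage's own, and the check only recognizes the flags when they are given as separate arguments.

```shell
#!/bin/sh
# Hypothetical pre-check, not part of salvage: return 2 when an argument list
# contains both -x and -p, the combination the option table says is rejected.
check_salvage_flags() {
    have_x=0
    have_p=0
    for arg in "$@"; do
        case "$arg" in
            -x) have_x=1 ;;
            -p) have_p=1 ;;
        esac
    done
    if [ "$have_x" -eq 1 ] && [ "$have_p" -eq 1 ]; then
        # Illustrative message only; the real utility's wording may differ.
        echo "do not combine -x and -p" >&2
        return 2
    fi
    return 0
}
```

A wrapper could run check_salvage_flags "$@" first and call /sbin/advfs/salvage "$@" only when the check returns 0.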

When to Use salvage

Use salvage as a last resort to recover file data from a damaged file domain. Before using the salvage utility:

1. Repair domain structures using the verify utility.

2. Attempt to recover the fileset data from backup media if the verify utility does not solve the problem.

Only if both methods are unsatisfactory should you use the salvage utility.

Remember that running the salvage utility does not guarantee that you will recover all your information. You may be missing files, directories, file names, or parts of files. The utility generates a log file that contains the status of files that were recovered. Use the -l flag to print the status of all files that are encountered.

Because salvage may be only partially successful in recovering files, it should not be considered a replacement for backups. Compaq recommends that any critical system data (as defined by the customer) be backed up regularly and that corruption be dealt with by restoring the corrupted files from backup.

Using salvage in Conjunction with Backup Media

In cases where the backup is not recent enough, salvage can be used in conjunction with the most recent backup to obtain current copies of files. These steps define how to perform this task:

1. Create a new file domain with the mkfdmn command.

2. Create new filesets and mount them.

3. Restore from backup to the new filesets.


4. Run the salvage utility with the -d flag set to the time of the backup, to recover files that have changed since the backup.

5. Move recovered files to new filesets. salvage places recovered files in directories named after the original filesets.
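For step 4, the time of the backup has to be written in the form the -d flag expects. The following is a minimal sketch: the helper names salvage_stamp and print_salvage_since are hypothetical, the input is assumed to be written as "YYYY-MM-DD hh:mm", and the second helper only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical helper: turn a backup time written as "YYYY-MM-DD hh:mm" into
# the CCYYMMDDhhmm form that the -d flag accepts, by dropping every character
# that is not a digit.
salvage_stamp() {
    printf '%s\n' "$1" | tr -cd '0-9\n'
}

# Print (do not run) the salvage command for step 4.
# $1 = backup time, $2 = recovery directory, $3 = damaged domain.
print_salvage_since() {
    echo "/sbin/advfs/salvage -d $(salvage_stamp "$1") -D $2 $3"
}

# Prints: /sbin/advfs/salvage -d 199611200000 -D /fixit user_domain
print_salvage_since "1996-11-20 00:00" /fixit user_domain
```

Choosing a time slightly earlier than the backup errs on the side of recovering a few files twice rather than missing files changed while the backup was running.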

Using salvage in the Absence of Backup Media

In cases where there is no backup media, salvage can be used, without the -d option, to recover all the filesets in the domain regardless of the date associated with the files.

After running salvage:

1. Mount new filesets.

2. Move recovered files to new filesets. salvage places recovered files in directories named after the filesets.

Using salvage in the Case of Very Large Domains

For very large domains, you may want to recover one fileset at a time if there is not enough space to store an entire domain.

If you are short on disk space, you can also write the recovered files to tape in tar format by using the -F and -f flags.
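Both approaches can be sketched as command builders. The following is a minimal sketch, not a tested recovery procedure: the helper names are hypothetical, the fileset names and archive target are supplied by the caller, and the helpers only print the salvage commands, using the -D, -F, and -f flags as described in Table 5-4, so that they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one salvage command per fileset so
# that each fileset can be recovered and moved off before the next is started.
# $1 = domain, $2 = recovery directory, remaining arguments = fileset names.
print_per_fileset() {
    domain=$1
    dir=$2
    shift 2
    for fileset in "$@"; do
        echo "/sbin/advfs/salvage -D $dir $domain $fileset"
    done
}

# Hypothetical helper: print the command that writes one fileset as a tar
# archive to $1 (a tape device or a file) instead of to a directory.
# $1 = archive, $2 = domain, $3 = fileset.
print_to_archive() {
    echo "/sbin/advfs/salvage -F tar -f $1 $2 $3"
}
```

For example, print_per_fileset user_domain /fixit users projects prints two commands, one for each of the placeholder filesets users and projects.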

Using salvage in the Case of Massive Metadata Corruption

If previous executions of salvage indicated that significant portions of metadata could not be found or a domain has been destroyed by accidental use of the mkfdmn utility, you can use salvage with the -S flag to recover data.

Using the -S flag specifies the slowest, most complete disk search for data. The utility runs in sequential search mode, checking each page on each volume in the domain. This flag can be used to recover most files from a domain which has been damaged from an incorrect execution of the mkfdmn utility. In some cases, the recovery will generate names based on the file’s tag number. These cases usually happen in the root directory, because mkfdmn usually overwrites root directory metadata.

If a file is fully recovered but has lost its file name, the customer must try to find out the old name, use the name assigned by salvage or provide a new name based on the data in the file. This is similar to the process used to recover lost files in the UFS lost+found directory. If a file is only partially recovered, the customer must decide if there is any useful data in the file and reconstruct the file, or continue without it.

When you specify this flag, there may be a security issue because the utility could recover old filesets and deleted files.

Summary

Describing AdvFS Troubleshooting Practices

These troubleshooting practices were described:

• Describe the problem and any relevant circumstances surrounding the problem as much as possible.

• Check for hardware-related causes of the problem.

• Check specific locations for any error messages.

Check the advfs_err(4) reference page to find a brief description based on an error number.

• Search CANASTA if a panic is involved.

• If you think it might be a bug in the software, research the reported bugs and patches for potential similarities.

• Use system tools to check for problems.

• Use AdvFS tools and utilities to check for and to fix problems.

Troubleshooting File System Corruption

The symptoms of file system corruption that a customer might report include:

• System panic

• Domain panic

• Corrupted data

• Unexpected behavior after entering ordinary commands on files in an AdvFS file system

Resolving Known AdvFS Issues

These known issues were described in this section:

• Log half full

• BMT exhaustion

Performing Case Studies

These three case studies were described:

• A domain corruption problem

The problem statement received from the customer was:

"Following UNIX upgrade from 3.2a to 3.2c, multi-volume AdvFS domains will not mount."

The configuration that the customer experienced the problem with consisted of two AlphaServer 2100 systems running DIGITAL UNIX Version 3.2a and DECsafe ASE Version 1.2.

• A 1K fragment-free list corruption problem.

The problem statement received from the customer was:

"Following the creation of a new file on an existing AdvFS domain, it is noticed that many other files on the same domain now contain the same data."

The configuration that the customer experienced the problem with consisted of an AlphaServer 2100 running DIGITAL UNIX Version 3.2d.

• An AdvFS file system corruption problem and resultant system panic.

The problem statement received from the customer was:

"AdvFS file systems become corrupted following upgrade from DIGITAL UNIX Version 3.2g to DIGITAL UNIX Version 4.0a".

The configuration that the customer experienced the problem with consisted of an AlphaServer 2100 running DIGITAL UNIX Version 4.0a and the Prestoserve option. Prestoserve is a nonvolatile hardware cache used to speed up synchronous access to file systems. This option consists of a hardware card and a software driver and utilities. The system was the main NFS server for the campus.

Compaq (Seagate) disk drives were not used in the configuration.

Using salvage

salvage is a new AdvFS utility available in the Tru64 UNIX Version 5.0 release. (Versions of salvage for earlier versions of Tru64 UNIX can be obtained from the File systems and Clusters Support Web page at sunny.alf.dec.com.)

salvage can recover information at the block level from disks containing damaged AdvFS domains (that is, filesets cannot be mounted).

Use salvage as a last resort to recover file data from a damaged file domain. Before using the salvage utility:

1. Repair domain structures using the verify utility.

2. If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.

Only if both methods are unsatisfactory should you employ the salvage utility.

Remember that running the salvage utility does not guarantee that you will recover all your information. You may be missing files, directories, file names, or parts of files. The utility generates a log file that contains the status of files that were recovered.

Since salvage may only be partially successful recovering the files, this should not be construed as a replacement for backups. Compaq recommends that regular backups be performed on any critical system data (as defined by the customer), and any corruption issues be dealt with by restoring any corrupted files from backups.

Exercises

Describing AdvFS Troubleshooting Practices

These questions provide a review of the material.

1. Which database should you search if there is a system panic involved with the problem?

2. Which AdvFS commands are new in Tru64 UNIX Version 5.0?

3. Why would you use the sys_check tool?

4. What are the common causes of AdvFS corruption?

5. In the case of generalized AdvFS corruption, what steps should you take to troubleshoot the problem?

6. Under what conditions should you use the salvage utility?

Solutions

Describing AdvFS Troubleshooting Practices

1. CANASTA is a Compaq internal crash dump analysis tool being used worldwide inside Compaq to store and evaluate crash footprint information for OpenVMS Alpha, OpenVMS VAX and Tru64 UNIX system crashes. CANASTA uses AI technology to provide solutions or additional troubleshooting information for system crash problems. The CANASTA tool is typically used in the CSCs, but access to the CANASTA knowledge database is also available using the CANASTA Mail Server, TIMA STARS and COMET.

By using the AutoCLUE tool, customer crash dump information can be automatically sent to Compaq using DSNlink and will be analyzed using the DSNlink CLUE post-processor. Solution information, if available, can be automatically returned to the customer and/or included in the call handling system. (See http://hanhwr.hao.dec.com/CANASTA.HTML#CANASTA Overview for more information.)

2. These AdvFS commands are new in Tru64 UNIX Version 5.0:

nvbmtpg    Displays pages of an AdvFS BMT file.
nvfragpg   Displays the pages of an AdvFS fragment file.
nvlogpg    Displays the log file of an AdvFS file domain.
nvsbmpg    Displays a page of the Storage BitMap (SBM) file.
nvtagpg    Displays a page formatted as a tag file page.
salvage    Recovers file data from damaged AdvFS file domains.
vdf        Displays disk information for AdvFS domains and filesets.

3. The sys_check tool is a ksh script that can be useful when debugging or diagnosing system problems. The script generates an HTML file of a Tru64 UNIX configuration. This script has been tested on DIGITAL UNIX Version 3.2*, and Version 4.0 systems. (See http://www-unix.zk3.dec.com/tuning/tools/sys_check/sys_check.html.)

4. AdvFS corruption is usually caused by one of the following:

— Hardware problem

Hardware problems are the most common sources of AdvFS-related system panics. One common cause of corruption in any file system is bad blocks on the physical disk. Another common cause is outdated firmware revisions.

— Uncontrolled system shutdown

AdvFS is generally robust enough to withstand unexpected system crashes or power outages, but these events may still cause corruption in certain cases.

— Software bugs in the AdvFS software

Software bugs can often be reproduced. AdvFS software bugs are usually fixed by patches. Any available, relevant patches should be applied in the initial stages of troubleshooting a problem. Available resources should be checked for relevant patches since it is not always obvious which patches might be relevant to AdvFS.

5. Possible troubleshooting actions for generalized corruption include:

— Check the binary errorlog for bad block replacements or other hardware events. If excessive, ensure the hardware problem is resolved before any other action.

— You can try adding volumes and removing the volumes having problems. In the case of general corruption, this will probably not solve the problem. This process is time consuming with a large number of bad files.

6. Use salvage as a last resort to recover file data from a damaged file domain. Before using the salvage utility:

— Repair domain structures using the verify utility.

— If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.

Only if both methods are unsatisfactory should you use the salvage utility.

A

AdvFS Commands and Utilities

About This Appendix

Introduction

This appendix describes a subset of the AdvFS command set and utilities that are particularly useful when working with AdvFS internals and troubleshooting. This appendix is intended to be used as an additional reference.

Topics:

• List of AdvFS commands

Resources

For more information on topics in this appendix as well as related topics, see the following:

• AdvFS Reference Pages

AdvFS Commands and Utilities

Overview

This section describes a subset of the AdvFS commands and utilities that are particularly useful when troubleshooting configurations that include AdvFS and when learning about AdvFS internals. The command and utility information in this appendix is based on the Tru64 UNIX Version 5.0 (STEEL) code base.

Commands in Certain Versions of Tru64 UNIX

Between DIGITAL UNIX Version 3.2x, DIGITAL UNIX Version 4.0x, and Tru64 UNIX Version 5.0, some AdvFS commands have been replaced or functionally modified. The following table lists some key commands that differ between versions.

DIGITAL UNIX Version 3.2x Command   DIGITAL UNIX Version 4.0x Command                   Tru64 UNIX Version 5.0 Command
msfsck                              verify                                              verify
vchkdir                             verify                                              verify
vods                                vbmtpg, vbmtchain                                   nvbmtpg
no equivalent?                      vfragpg                                             nvfragpg
logread                             vlogpg, vlsnpg                                      nvlogpg
no equivalent?                      vtagpg                                              nvtagpg
no equivalent                       salvage (field test version in DIGITAL UNIX 4.0D)   salvage
no equivalent                       no equivalent                                       savemeta
no equivalent                       no equivalent                                       vdf
no equivalent                       vfile                                               vfilepg
no equivalent                       no equivalent                                       vsbmpg

addvol

Description

The addvol command adds a volume to an existing file domain.

/usr/sbin/addvol [-F] [-x num_pages] [-p num_pages] special domain

special specifies the block special device name of the disk that you are adding to the file domain. domain specifies the name of the file domain.

Options

-F            Ignores overlapping partition or block warnings.
-x numpages   Sets the number of pages by which the bitfile metadata table extent size grows. The default is 128 pages.
-p numpages   Sets the number of pages to preallocate for the bitfile metadata table. The default is 0 pages.

The flags -x numpages and -p numpages will be retired in a future release of the operating system. Users should migrate away from using these flags. These flags were necessary in previous releases to manipulate contiguous storage for bitfile metadata table (BMT) operations. In Tru64 UNIX Version 5.0, storage for BMT operations is managed internally by the operating system.

Operation

A newly created file domain consists of one volume, which can be a disk or a logical volume. The addvol utility enables you to increase the number of volumes within an existing file domain. You can add volumes immediately after creating a file domain, or you can wait until the filesets within the domain require additional space.

For optimum performance, each volume that you add should consist of the entire disk (typically, partition c). Existing data on the volume you add is destroyed during the addvol procedure. Do not add a volume containing data that you want to keep.

The addvol command checks for potential overlapping partitions before adding the volume. If you try to add a volume that would cause partitions to overlap with any other file systems, including Logical Storage Manager (LSM), UNIX file system (UFS), and AdvFS, or that would overlap with blocks currently in use, the following message is displayed and the volume is not added:

/dev/rdisk/dsk1g or an overlapping partition is open.
Quitting ....
addvol: Can’t add volume ’/dev/rdisk/dsk1g’ to domain ’proj_x’

If you try to add a volume that would cause partitions to overlap with other file systems, but none of the partitions are currently in use, you can choose to continue with the procedure or stop. Use the -F flag to disable testing for overlap. Disabling the overlap check can result in extensive data loss and should be used with extreme caution.

Adding volumes to a file domain does not affect the logical structure of the filesets within the file domain. You can add a volume to an active file domain while its filesets are mounted and in use. While up to 256 volumes per domain are allowed, limiting the number of volumes to three decreases the risk of disk errors that can cause the entire domain to become inaccessible.

The /etc/fdmns directory contains subdirectories named for the AdvFS domains defined on the system. Within each subdirectory there is a symbolic link (or more than one for multivolume domains) that points to the block device file that contains the data for the AdvFS volume. When you recreate a file domain (because the information in /etc/fdmns was destroyed), you must rebuild the /etc/fdmns directory and any symbolic links within the domain subdirectory. It is good practice to maintain a hardcopy record of each volume you have, since you must have the names of all the volumes in the domain to manually recreate the /etc/fdmns directory. You can use the advscan command to recreate the links for a file domain.
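The directory layout described above can be illustrated with a short sketch. It is not the advscan implementation: the function name is hypothetical, and naming each link after the last component of the device path is an assumption. Passing the root directory as a parameter lets you try the sketch in a scratch directory before touching /etc/fdmns.

```python
import os
from pathlib import Path
from typing import Iterable


def rebuild_domain_links(fdmns_root: Path, domain: str, volumes: Iterable[str]) -> Path:
    """Recreate <fdmns_root>/<domain>/ with one symbolic link per volume.

    volumes holds the block device names recorded for the domain,
    for example "/dev/disk/dsk2c".
    """
    domain_dir = fdmns_root / domain
    domain_dir.mkdir(parents=True, exist_ok=True)
    for device in volumes:
        # Assumption: the link is named after the device, e.g. dsk2c.
        link = domain_dir / Path(device).name
        if not link.is_symlink():
            os.symlink(device, link)
    return domain_dir
```

The sketch depends on having the complete list of volume names for the domain, which is why a hardcopy record of the volumes matters.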

You cannot exceed 256 volumes per file domain. Also, you must have root user privilege to access this utility. The AdvFS Utilities license must be present for addvol to run.

AdvFS does not support a multivolume root file system. You cannot use the addvol utility to expand the root domain.

DIGITAL UNIX V4.x Specific Information for addvol (-x and -p will be retired)

Systems with domains that contain very large numbers of files can use more metadata extents (similar to inodes in UFS) than normal. By default, AdvFS attempts to grow the bitfile metadata table (BMT) by 128 pages each time additional metadata extents are needed. Frequent requests by the system to increase the BMT cause the metadata to become very fragmented, which can result in an out of disk space error when there is actually space available.

You can reduce the amount of metadata fragmentation in one of two ways:

• Preallocating all of the space for the BMT when the volume is added

• Increasing the number of pages that the system attempts to grow the metadata table each time more space is needed.

To preallocate all the BMT space that you expect the file domain to need, use the mkfdmn command with the -p flag set to specify the number of pages to preallocate. Space that is preallocated to the BMT cannot be deallocated. Do not preallocate excessive space for the BMT. The following table provides BMT page number estimates for numbers of files.

To set the BMT to grow by more than 128 pages each time additional metadata extents are needed, use the addvol command with the -x flag set to specify a number of pages greater than 128. You can increase the number of pages to any value; the following table shows suggested guidelines.
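As a convenience, the guideline values in this section's table can be looked up programmatically. The sketch below copies the published rows; the function name and the rule of rounding up to the next listed file count are assumptions, not documented AdvFS behavior.

```python
from bisect import bisect_left
from typing import Tuple

# (number of files, extent size in pages, metadata table size in pages),
# copied from the guideline table in this section.
BMT_GUIDELINES = [
    (50_000, 128, 3_600),
    (100_000, 256, 7_200),
    (200_000, 512, 14_400),
    (300_000, 768, 21_600),
    (400_000, 1024, 28_800),
    (800_000, 2048, 57_600),
]


def suggest_bmt_sizes(expected_files: int) -> Tuple[int, int]:
    """Return (extent pages for -x, preallocated pages for -p).

    Assumption: a file count between two rows uses the next larger row.
    """
    limits = [row[0] for row in BMT_GUIDELINES]
    index = bisect_left(limits, expected_files)
    if index == len(BMT_GUIDELINES):
        raise ValueError("file count is beyond the published guidelines")
    _, extent_pages, table_pages = BMT_GUIDELINES[index]
    return extent_pages, table_pages
```

For example, suggest_bmt_sizes(150_000) selects the 200,000-file row. Because preallocated BMT space cannot be deallocated, treat the result as a starting point rather than a requirement.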

Number of Files     Extent Size (pages)   Metadata Table Size (pages)
Less than 50,000    Default (128)         3,600
100,000             256                   7,200
200,000             512                   14,400
300,000             768                   21,600
400,000             1024                  28,800
800,000             2048                  57,600

To get the maximum benefit from increasing the number of metadata table extent pages, use the same number of pages when adding a volume with the addvol command as was assigned when the domain was created with the mkfdmn command.

advfsstat

Description

The advfsstat command displays AdvFS performance statistics.

/usr/sbin/advfsstat [options] [stats-type] domain
/usr/sbin/advfsstat [options] -f 0 | 1 | 2 domain fileset

domain specifies the name of an existing domain. fileset specifies the name of an existing fileset.

Option      Function
-i sec      Specifies time interval (in seconds) between displays. advfsstat collects and reports information only for the specified interval. If sec is omitted, advfsstat uses a default interval of one second.
-c count    Specifies the number of reports. For example, setting the advfsstat command flags -i 1 and -c 10 would produce 10 reports at 1 second intervals. If count is omitted, advfsstat returns one report.
-s          Displays raw statistics for the interval.
-R          Displays the percent ratio of the returned statistics. (Use only with -b, -p, or -r flags.)

Stats-types   Function
-b            Displays the buffer cache statistics for the selected domain.
-f 0          Displays all fileset vnop statistics for the selected fileset.
-f 1          Displays all fileset lookup statistics for the selected fileset.
-f 2          Displays common fileset vnop statistics.
-l 0          Displays basic lock statistics.
-l 1          Displays lock statistics.
-l 2          Displays detailed lock statistics.
-n            Displays namei cache statistics.
-p            Displays buffer cache pin statistics.
-r            Displays buffer cache ref statistics.
-S            Displays smoothsync queue statistics.
-v 0          Displays volume read/write statistics.
-v 1          Displays detailed volume statistics.
-v 2          Displays volume I/O queue statistics as a snapshot of everything currently on the queue.
-v 3          Displays volume I/O queue statistics for everything put on the queue during the last interval (-i).
-B r          Displays BMT record read statistics.
-B w          Displays BMT record write/update statistics.

Operation

The advfsstat command displays a wide selection of AdvFS performance statistics. It reports in units of one disk block (512 bytes) per interval with the default being one second. Any number of options (listed in the options table) may be used. The -R option may be specified only with the stats-types of -b, -p, and -r. The options -i and -c require parameters.

Only one stats-type (listed in the stats-type table) may be specified with the command. The -f, -l, -v, and -B stats-types require parameters. For the -f stats-type, the fileset parameter must also be specified.

advscan

Description

The advscan command locates AdvFS volumes (disk partitions or LSM disk groups) that are in AdvFS domains.

/sbin/advfs/advscan [-g] [-a] [-r] [-f domain_name] devices... disk_group...

devices specifies the device names of the disks to scan. disk_group specifies the Logical Storage Manager (LSM) disk groups to scan for AdvFS volumes.

Use the advscan command when you have moved disks to a new system, have moved disks around in a way that has changed device numbers or have lost track of where the domains are. The command is also used for repair if you delete the /etc/fdmns directory, delete a file domain directory in the /etc/fdmns directory, or delete links from a file domain directory under the /etc/fdmns directory.

Options

-g               Lists the partitions in the order they are found on disk.
-a               Scans all disks found in any /etc/fdmns domain as well as those in the command line.
-r               Recreates missing domain. The domain name is created from the device names or LSM disk group names.
-f domain_name   Fixes the domain count and the links for the named domain.

Operation

The advscan command locates AdvFS volumes (disk partitions or LSM volumes) that are in AdvFS domains. Given the AdvFS volumes, you can recreate or fix the /etc/fdmns directory of a named domain or LSM disk group. For example, if you have moved disks to a new system, moved disks around in a way that has changed device numbers, or have lost track of where the AdvFS domains are, you can use this command to locate them.

Another use of the advscan command is to repair AdvFS domains when you have broken them. For example, if you mistakenly delete the /etc/fdmns directory, delete a domain directory in the /etc/fdmns directory, or delete links from a domain directory under the /etc/fdmns directory, you can use the advscan command to fix the problem.

The advscan command accepts a list of disk device names and/or LSM disk group names and searches all the disk partitions to determine which partitions are part of an AdvFS domain.

You can run the advscan command to rebuild all or part of your /etc/fdmns directory or you can rebuild it manually by supplying all the names of the AdvFS volumes in a domain.

If the -g flag is not set, the AdvFS volumes are listed as they are grouped in domains. Set the flag to list the AdvFS volumes in the order they are found on each disk.

Run the advscan command with the -r flag set to recreate missing domains from the /etc/fdmns directory, missing links, or the entire /etc/fdmns directory.

Although the advscan command will rebuild the /etc/fdmns directory automatically, Compaq recommends that you always keep a hardcopy record of the current /etc/fdmns directory.

To determine if a partition is part of an AdvFS domain, the advscan command performs the following functions:

• Reads the first two pages of a partition to determine if it is an AdvFS partition and to find the file domain information.

• Reads the disk label to sort out overlapping partitions. The sizes of overlapping partitions are examined and compared to the file domain information to determine which partitions are in the file domain. These partitions are reported in the output.

• Reads the boot block to determine if the partition is AdvFS root startable.

The advscan command displays the date the domain was created, the on-disk structure version, and the last known or current state of the volume.

To mount an AdvFS file system in a domain, the domain must be consistent. An AdvFS domain is consistent when the number of physical partitions or volumes with the correct domain ID is equal to both the domain volume count (a number stored in the domain) and the number of links to the partitions in the /etc/fdmns directory.

Domain inconsistencies can occur in diverse ways. Use the -f flag to correct domain inconsistencies. If you attempt to mount an inconsistent domain, a message similar to the following will appear on the console:

# Volume count mismatch for domain dmnz.
dmnz expects 2 volumes, /etc/fdmns/dmnz has 1 links.
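The consistency rule can be expressed as a small check. This is an illustrative sketch only: the function and parameter names are hypothetical, the three counts would have to be gathered by other means (for example from advscan output), and the message wording is modeled on the console message above rather than taken from AdvFS.

```python
from typing import List


def domain_problems(domain: str, volumes_on_disk: int,
                    domain_volume_count: int, fdmns_links: int) -> List[str]:
    """Return a list of mismatches; an empty list means the domain is consistent.

    volumes_on_disk: partitions or volumes found with the correct domain ID.
    domain_volume_count: the volume count stored in the domain.
    fdmns_links: links found in the domain's /etc/fdmns subdirectory.
    """
    problems = []
    if volumes_on_disk != domain_volume_count:
        problems.append(
            f"{domain} expects {domain_volume_count} volumes, "
            f"found {volumes_on_disk} on disk")
    if fdmns_links != domain_volume_count:
        problems.append(
            f"{domain} expects {domain_volume_count} volumes, "
            f"/etc/fdmns/{domain} has {fdmns_links} links")
    return problems
```

With two volumes on disk, a stored count of 2, and one link, the check reports the same kind of mismatch as the console message.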

You must have root user privilege to access this command.

balance

Description

The balance utility balances the percentage of used space among volumes in a domain.

/usr/sbin/balance [-v] domain

domain specifies the name of the file domain.

Options

-v   Displays information on which files are being moved to different volumes. Selecting this flag slows down the balance procedure.

Operation

The balance utility evenly distributes the percentage of used space between volumes in a multivolume domain. This improves performance and evens the distribution of future file allocations.

Use the showfdmn command to determine the percentage of used space on each volume. This information allows you to determine the need to balance the volumes.

The balance utility can be used at any time, but it is particularly useful after adding or removing a volume (addvol, rmvol) because these procedures can cause file distribution to become uneven.

When you plan to run both the defragment and balance utilities on the same domain, run the defragment utility before running the balance utility. The defragment utility often improves the balance of free space, thus enabling the balance utility to run more quickly.

Before you can balance volumes in a file domain, all filesets in the file domain must be mounted. If you try to balance volumes in an active file domain that includes unmounted filesets, the system displays an error message indicating that a fileset is unmounted. You must have root privilege to access this utility. The AdvFS Utilities license must be present.

You cannot run the balance utility while the addvol, rmvol, defragment, rmfset, or balance utility is running on the same file domain. If you attempt to do this, a warning message is displayed.

The balance utility does not operate on striped files in the domain and does not include them in its calculations on used space.
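To make the goal of equal used-space percentages concrete, the following sketch computes how many blocks each volume would gain or lose to reach the domain-wide average. It only illustrates the arithmetic and is not the algorithm the balance utility uses; the function name and input layout are assumptions, and the used and total figures would come from showfdmn output.

```python
from typing import Dict, Tuple


def balance_deltas(volumes: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Map each volume to the blocks it should gain (+) or lose (-).

    volumes maps a volume name to (used_blocks, total_blocks).
    After the moves, every volume has the same percentage of used space.
    """
    total_used = sum(used for used, _ in volumes.values())
    total_size = sum(size for _, size in volumes.values())
    target_ratio = total_used / total_size  # domain-wide used fraction
    return {
        name: round(size * target_ratio) - used
        for name, (used, size) in volumes.items()
    }
```

For a full 1000-block volume at 90 percent and a newly added 3000-block volume holding 100 blocks, the target is 25 percent on each, so 650 blocks would move from the first volume to the second.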

chfile

Description

The chfile command lets you change attributes of an AdvFS file.

/usr/sbin/chfile [-l on | off] [-L on | off] filename ...

filename specifies one or more file names.

Options

-l on | off   Enables or disables (on | off) forced synchronous write requests to the specified filename. By default, forced synchronous write requests to a file are off.
-L on | off   Enables or disables (on | off) atomic write data logging on the specified filename. By default, atomic write data logging is turned off.

Operation

The chfile command lets you view or change attributes of an AdvFS file. The only file attribute that can be set with the chfile command is the I/O mode used when write requests are made to the file. There are three settings for this I/O mode:

• Asynchronous I/O

The default setting. Write requests are cached, the write system call returns to the calling program, and later (asynchronously), the data is written to the disk.

• Forced synchronous I/O

When this mode is enabled, write requests to a file behave as if the O_SYNC option had been set when the file was opened. The write system call returns a success value only after the data has been successfully written to disk.

• Atomic write data logging I/O

When this mode is enabled, write requests to a file are asynchronous. However, the write requests are also written to the AdvFS log file. Should a system crash during or after a write system call when this mode is enabled for the file, only complete write requests will be in the file on disk. This atomic operation guarantees that all (or none) of a write buffer will be in the file and that there will not be portions of the write request in the file. For example, suppose a write of an 8192-byte buffer was started and, during the write system call (or shortly thereafter) the system crashed. When the system was rebooted, either the entire 8192 bytes of data would be written to the file or none of it would have been written to the file. There would be no chance that some (but not all) bytes of the write request would be in the file.

The -l and -L options are mutually exclusive. You cannot simultaneously enable both forced synchronous writes and atomic write data logging on a file. However, you can override the current I/O mode for a file. For example, you can change a file’s I/O mode setting from forced synchronous writes to atomic write data logging by using the chfile -L on command.

If you do not use the options, the command displays the current state of the file’s I/O attribute.

Use the chfile command on AdvFS files that have been remotely mounted across NFS. You can run the chfile command on an NFS client to examine or change the I/O mode of AdvFS files on the NFS server.

Enabling atomic write data logging for a file will retard performance because the data is written to both the user file and the AdvFS log file. Enabling forced synchronous writes to a file also can retard system performance.

To use the chfile command on AdvFS files that are mounted across NFS, the NFS property list daemon, proplistd, must be running on the NFS client and the fileset must have been mounted on the client using the proplist option.

Only writes of up to 8192 bytes are guaranteed to be atomic for files that use atomic write data logging. When writing to an AdvFS file that has been mounted across NFS, a further restriction applies: the offset into the file of the write must be on an 8K page boundary, because NFS performs I/O on 8K page boundaries.
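These two restrictions can be captured in a simple predicate. The sketch below only restates the rules in this paragraph; the function name and parameters are hypothetical and nothing here queries AdvFS.

```python
ATOMIC_LIMIT_BYTES = 8192  # largest write guaranteed to be atomic
NFS_PAGE_BYTES = 8192      # NFS performs I/O on 8K page boundaries


def atomic_write_guaranteed(offset: int, length: int, over_nfs: bool = False) -> bool:
    """Return True if a write to a file with atomic write data logging
    enabled falls within the documented atomicity guarantee."""
    if length > ATOMIC_LIMIT_BYTES:
        return False
    if over_nfs and offset % NFS_PAGE_BYTES != 0:
        return False
    return True
```

A 4096-byte write at offset 100 qualifies on a local mount but not across NFS, because the offset is not on an 8K page boundary.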

The showfile command does not display the I/O mode for files that are mounted across NFS. To display the I/O mode of these files, use the chfile command.

Usually AdvFS, when operating on small files that do not have a size that is a multiple of 8K, puts the last part of the files (their frags) into a special metadata file called the fileset frags file as a way to reduce disk fragmentation. For example, a file that does not use atomic write data logging and has had 20K of data written to it will occupy 20K of disk space (as displayed by the du command).

Files that use atomic write data logging are exempt from this behavior. As a result, they always have a disk usage (as displayed by the du command) that is a multiple of 8K. For example, a file that has atomic write data logging enabled and has had 20K of data written to it occupies 24K of disk space.
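The two disk usage examples reduce to a rounding rule, sketched below. The function name is hypothetical, and treating the frag as exactly the size of the remainder (in whole kilobytes) is an assumption made to match the 20K example.

```python
PAGE_KB = 8  # AdvFS page size used in the examples above


def disk_usage_kb(data_kb: int, atomic_logging: bool) -> int:
    """Estimate the disk usage that du would report for a small file."""
    if atomic_logging:
        # No frag is used, so usage rounds up to a whole number of 8K pages.
        return -(-data_kb // PAGE_KB) * PAGE_KB
    # Assumption: whole pages plus a frag that holds exactly the remainder.
    return data_kb
```

For 20K of data, the function returns 20 without atomic write data logging and 24 with it, matching the examples in the text.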

If a file has a frag, an attempt to activate atomic write data logging on it will fail.

Files that use atomic write data logging cannot be memory-mapped through the mmap system call. The error ENOTSUP is returned if the attempt is made. If a file has been memory-mapped through the mmap system call, an attempt to activate atomic write data logging on it fails with the same error.

chfsets

Description

The chfsets command enables you to change fileset quotas (file usage limits and block usage limits).

/sbin/chfsets [-F limit] [-f limit] [-B limit] [-b limit] domain [fileset...]

domain specifies the name of the file domain. fileset specifies the name of one or more filesets.

Options

-F limit   Specifies the file usage soft limit (quota) of the fileset.
-f limit   Specifies the file usage hard limit (quota) of the fileset.
-B limit   Specifies the block usage soft limit (quota) in 1K blocks of the fileset.
-b limit   Specifies the block usage hard limit (quota) in 1K blocks of the fileset.

Operation

Filesets can have both soft and hard disk storage and file limits. When a hard limit is reached, no more disk space allocations or file creations which would exceed the limit are allowed. The soft limit may be exceeded for a period of time (called the grace period). The grace periods for the soft limits are set with the edquota command.

The command also displays the changes made to the file and block usage limits.

The chfsets command displays the following fileset information:

• Id is a unique number (in hexadecimal format) that identifies a file domain and fileset.

• File H limit is the file usage hard limit of the specified fileset before the change followed by the new limit.

• Block H limit is the block usage hard limit of the specified fileset.

• File S limit is the file usage soft limit of the specified fileset before the change followed by the new limit.

• Block S limit is the block usage soft limit of the specified fileset before the change followed by the new limit.

At least one fileset within the domain must be mounted for the chfsets command to succeed. You must have root user privilege to access this command.
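The soft and hard limit behavior described under Operation can be sketched as a decision function. This is an illustration, not AdvFS code: the names are hypothetical, and two points are assumptions not stated in this guide, namely that a limit of 0 means no limit and that a soft limit is enforced once its grace period has expired.

```python
def allocation_allowed(usage_after: int, soft_limit: int, hard_limit: int,
                       grace_expired: bool) -> bool:
    """Decide whether an allocation (blocks or files) may proceed.

    usage_after is the fileset usage if the request were granted.
    """
    if hard_limit and usage_after > hard_limit:
        return False  # the hard limit can never be exceeded
    if soft_limit and usage_after > soft_limit and grace_expired:
        return False  # assumption: soft limit is enforced after the grace period
    return True
```

The same function applies to either quota type, since file limits and block limits follow the same soft and hard pattern.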

chvol

Description

The chvol command enables you to change the attributes of a volume in an active domain.

/sbin/chvol [-r blocks] [-w blocks] [-t blocks] [-c on | off] [-A] special domain

special specifies the block special device name, such as /dev/disk/dsk2c. domain specifies the name of the file domain.

Options

-r blocks Specifies the maximum number of 512-byte blocks that the file system reads from the disk at one time.

-w blocks Specifies the maximum number of 512-byte blocks that the file system writes to the disk at one time.

-t blocks Specifies the maximum number of dirty, 512-byte blocks that the file system will cache in-memory (per volume in a domain). Dirty means that the data has been written by the application but the file system has cached it in memory so it has not yet been written to disk.

The number of blocks must be a multiple of 16; the valid range is 0 to 32768. The default (when a volume is added to a domain) is 768 blocks. For optimal performance, specify blocks in multiples of wblks (as specified by -w).

-c on | off Turns I/O consolidation mode on or off.

-A Activates a volume after an incomplete rmvol operation.

-l Displays the range of I/O transfer sizes.

Operation

The file system can consolidate a number of I/O transfers into a single, large I/O transfer. The larger the I/O transfer, the better the file-system performance. If you attempt to change the attributes of a volume in a domain that is not active, an error message is produced.

The initial I/O transfer parameter for both reads and writes is 128 blocks. Once you change the I/O transfer parameters with the -r flag or the -w flag, the parameters remain fixed until you change them. The values for the I/O transfer parameters are limited by the device driver. Every device has a minimum and maximum value for the size of the reads and writes it can handle. If you set a value that is outside of the range that the device driver allows, the device automatically resets the value to the largest or smallest it can handle.
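The resetting behavior described above amounts to clamping the requested value into the range that the device driver supports. The following Python sketch illustrates this under stated assumptions: the function names are hypothetical and the device limits in the usage example are invented figures, not values for any real device.

```python
BLOCK_BYTES = 512  # chvol expresses transfer sizes in 512-byte blocks


def effective_transfer(requested_blocks, dev_min_blocks, dev_max_blocks):
    """Clamp a requested -r or -w value into the device's supported range."""
    return max(dev_min_blocks, min(requested_blocks, dev_max_blocks))


def blocks_to_kbytes(blocks):
    """Convert a count of 512-byte blocks to Kbytes."""
    return blocks * BLOCK_BYTES // 1024
```

For example, with hypothetical device limits of 16 and 2048 blocks, a request for 4096 blocks would be reset to 2048, and the initial value of 128 blocks corresponds to 64 Kbytes per transfer.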

By default, the I/O consolidation mode (cmode) is on. The cmode must be on for the I/O transfer parameters to take effect. You can use the -c flag to turn the cmode off, which sets the I/O transfer parameter to one page.

For file system workloads that are heavily biased toward random writes, use the -t flag to increase the file system’s dirty threshold. This may improve write performance.
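Before passing a value to the -t flag, it can help to check it against the documented constraints: a multiple of 16 in the range 0 to 32768, and preferably a multiple of the write transfer size. The following Python sketch does that check. The function name is hypothetical, and the default wblks of 128 is taken from the initial I/O transfer parameter stated in this section.

```python
def check_dirty_threshold(blocks, wblks=128):
    """Validate a -t value; return True if it is also a multiple of wblks."""
    if not 0 <= blocks <= 32768:
        raise ValueError("blocks must be in the range 0-32768")
    if blocks % 16 != 0:
        raise ValueError("blocks must be a multiple of 16")
    # Being a multiple of wblks is a performance recommendation, not a requirement.
    return wblks > 0 and blocks % wblks == 0
```

The default threshold of 768 blocks passes both checks, because 768 is a multiple of both 16 and 128.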

Interrupting an rmvol operation can leave the volume in an inaccessible state. If a volume does not allow new allocations after an rmvol operation, use the chvol command with the -A flag to reactivate the volume.

Using the chvol command without any flags displays the current cmode and the I/O transfer parameters.

The values for the wblks and rblks attributes are limited by the device driver. You must have root user privilege to access this command.

defragment

Description

The defragment utility makes the files in a file domain more contiguous.

/usr/sbin/defragment [-e] [-n] [-N threads] [-t time] [-T time] [-v] [-V] domain

domain specifies the name of the file domain.

Options

-e Ignores errors and continues, if possible. Errors that are ignored are usually related to a specific file.

-n Prevents defragmentation from actually taking place. Use in conjunction with the -v flag to display statistics on the number of extents in the file domain.

-N threads Specifies the number of threads to run on the utility. The default number of threads that will be run is the number of volumes in the domain. The maximum number you can specify is 20.

-t time Specifies a flexible time interval (in minutes) for the defragment utility to run. If the utility is performing an operation when the specified time has elapsed, the procedure continues until the operation is complete.

-T time Specifies an exact time interval (in minutes) for the defragment utility to run. When the specified time has elapsed, the defragmentation procedure stops, even if it is performing an operation.

-v Displays statistics on the amount of fragmentation in the file domain and information on the progress of the defragment procedure.

-V Displays the same information provided by the -v flag along with information about each operation the defragment utility performs on each file. This flag slows the defragment procedure.

Operation

When a file consists of many discontiguous file extents, the file is fragmented on the disk. File fragmentation reduces the read/write performance because more I/O operations are required to access a fragmented file.

The defragment utility attempts to reduce the number of file extents in a file domain by making files more contiguous. Defragmenting a file domain often makes the free space on a disk more contiguous, resulting in less fragmented file allocations in the future.

Before you can defragment a file domain, all filesets in the file domain must be mounted. If you try to defragment an active file domain that includes unmounted filesets, the system displays an error message indicating that a fileset is unmounted.

To determine the amount of file fragmentation in a file domain before using the defragment utility, issue the defragment command with the -v and -n flags. This provides the fragmentation information without starting the defragment utility.

Before running the defragment utility, delete any files in the domain that you do not need. This gives the defragment utility more free space to use, which produces better results. Deleting files afterwards creates more free-space fragments. Additionally, run the balance utility on the domain before you run the defragment utility in order to balance domain free space before defragmenting the domain files.

To monitor the improvement made to the file domain by the defragment utility, use the verbose mode flag, -v, which displays the following information:

• The number of extents in the specified domain. (Contiguous extents in sparse files are counted as one extent after defragmentation, when in fact there are several contiguous file extents.)

• The number of files that have extents. (Note that files do not have extents if the files are so small that they are kept with the metadata.)

• The average number of extents for each file that has one or more extents.

• The efficiency of the entire file domain. An increase in value indicates improvement.

• The number of free-space fragments in the domain.

The defragment utility requires a minimum of 1 percent of the total space, or 5 megabytes per volume (whichever is less) to be free in order to run.
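The free-space requirement can be checked ahead of time with a small calculation. The following Python sketch reads the rule as applying to each volume separately (1 percent of that volume's space or 5 megabytes, whichever is less), which is one interpretation of the sentence above. The function names are hypothetical and a megabyte is taken to be 1,048,576 bytes.

```python
MB = 1024 * 1024  # assumption: 1 megabyte = 2**20 bytes


def min_free_bytes(volume_bytes):
    """Free space needed on one volume: 1 percent of its size or 5 MB, whichever is less."""
    return min(volume_bytes // 100, 5 * MB)


def can_defragment(volumes):
    """volumes is an iterable of (total_bytes, free_bytes) pairs, one per volume."""
    return all(free >= min_free_bytes(total) for total, free in volumes)
```

Under this reading, the 5-megabyte figure is the controlling limit for any volume larger than 500 megabytes.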

The defragment utility does not defragment striped files.

You cannot run the defragment utility while the addvol, balance, defragment, rmfset, or rmvol utility is running on the same file domain.

You must have root user privilege to access this utility.

logread

Description

Replaced in V5 with the nvlogpg command.

migrate

Description

The migrate utility moves a file or file pages to another volume in an AdvFS domain.

/usr/sbin/migrate [-p pageoffset] [-n pagecount] [-s volumeindex] [-d volumeindex] filename

filename specifies the name of the file or file pages to be migrated from the volume. The file can be simple or striped.

Options

-p pageoffset Specifies the page offset of the first page to migrate. The first page of the file is page 0. The default page offset is 0. If you do not specify the -p flag, the migrate command migrates pages starting at page 0 of the file.

-n pagecount Specifies the number of pages to migrate, starting at the pageoffset value. The default pagecount is to EOF. If you do not specify the -n flag, the migrate command migrates pages from the pageoffset value to the end of the file.

-s volumeindex Specifies the volume index number of the volume from which the pages are to be migrated. Use the showfile -x command to determine the volume index number of the volumes in the AdvFS file domain.

If you specify the -s flag and the volume that contains the file does not contain any data extents of that file, the utility returns success without taking any action.

You must use the -s flag when you are migrating striped files. You can move pages of a striped file or a striped file segment, which is the entire portion of a striped file that resides on the specified volume, to another volume.

-d volumeindex Specifies the volume index number of the volume to which the pages are to be migrated. You can determine the volume index number of the volumes in an AdvFS file domain by using the showfile -x command.

If you do not specify the -d flag, the file or file pages are moved to any volume or volumes with available space.

Operation

The migrate utility moves the specified simple file to another volume in the same file domain. The utility also moves pages of a simple file or pages of a striped file segment to another volume (or volumes, if necessary) within the file domain.

Because there are no read/write restrictions when using this command, you can migrate a file while users are reading it, writing to it, or both, without disrupting file I/O. File migration is transparent to users.

When you run the migrate utility with only the -p and -n flags, the utility attempts to allocate the destination pages contiguously on one destination volume in the file domain. If there are not enough free, contiguous blocks to accomplish the move, the utility then attempts to allocate the pages to the next available blocks on the same volume. If there are not enough free blocks on the same volume, the utility then attempts to move the file to the next available volume or volumes. The utility returns an error diagnostic if it cannot accomplish the move.

You must use the -s, -n, and -p flags in order to move pages of a striped file from one volume to another. Only those pages assigned to the source volume are moved to the destination volume; not all pages in the file are moved.

You can use the migrate utility to move heavily accessed files or pages of files to a different volume in the file domain. Use the -d flag to indicate a specific volume. Also, you can use the utility to defragment a specific file, because the migrate utility defragments a file whenever possible.

You can only perform one migrate operation on the same file at the same time. When you migrate a striped file, you can only migrate from one source volume at a time.

You must have root user privilege to access this command.
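The defaults for the -p and -n flags determine which pages a migrate request covers: page 0 when -p is omitted, and through the end of the file when -n is omitted. The following Python sketch models that selection. The function name is hypothetical, and clamping a page count that runs past the end of the file is an assumption of the sketch, not documented behavior.

```python
def pages_to_migrate(file_pages, pageoffset=0, pagecount=None):
    """Return the range of page numbers selected by -p pageoffset and -n pagecount."""
    if not 0 <= pageoffset <= file_pages:
        raise ValueError("pageoffset is outside the file")
    if pagecount is None:
        end = file_pages  # no -n flag: migrate through the end of the file
    else:
        end = min(file_pages, pageoffset + pagecount)  # assumption: clamp at EOF
    return range(pageoffset, end)
```

For a five-page file, omitting both flags selects pages 0 through 4, and specifying -p 1 -n 2 selects pages 1 and 2.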

mkfdmn

Description

The mkfdmn command creates a new AdvFS file domain.

/sbin/mkfdmn [-F] [-l num_pages] [-o] [-p num_pages] [-r] [-x num_pages] special domain

special specifies the block special device name, such as /dev/disk/dsk1c, of the initial volume that you use to create the file domain. domain specifies the name of the file domain.

Options

-F Ignores overlapping partition or block warnings.

-l num_pages Sets the number of pages in the log file. AdvFS rounds this number up to a multiple of four.

-o Overwrites an existing file domain, allowing you to recreate the domain structure.

-p num_pages Sets the number of pages to preallocate for the bitfile metadata table (BMT). The default is 0 (zero) pages.

-r Specifies the file domain as the root domain. This prevents multiple volumes in the root domain. AdvFS supports only one volume in the root domain.

-x num_pages Sets the number of pages by which the bitfile metadata table (BMT) extent size grows. The default is 128 pages.

-V3 | -V4 Specifies the nature of the on-disk data structures. Default is V4. The AdvFS domain version number for Tru64 UNIX V5 is 4.

The flags -x num_pages and -p num_pages will be retired in a future release of the operating system. Users should plan to migrate away from use of these flags. The use of these flags was necessary in previous releases to manipulate contiguous storage for bitfile metadata table (BMT) operations. In Tru64 UNIX Version 5.0, storage for BMT operations is managed internally by the operating system using the RBMT.

Operation

The mkfdmn command creates a file domain, which is a logical construct containing both physical volumes (disks or disk partitions) and filesets. When you create a file domain, you must specify one volume. If the new file domain will overlap mounted file systems, swap areas, or reserved partitions, you are given the choice of continuing or aborting the command.

Existing data on the volume you assign to a new file domain is destroyed when the file domain is created.

If you try to add a volume that would cause partitions to overlap with any other file system, including LSM, UFS, and AdvFS, or that would overlap with blocks that are in use, the system displays a message asking if you wish to continue. Using the -F flag disables testing for overlap.

The mkfdmn command does not create a file system that you can mount. In order to mount a file system, the file domain must contain one or more filesets. After you run the mkfdmn command, you must run the mkfset command to create at least one fileset within the new file domain. You can access the file domain as soon as you mount one or more filesets. For more information about creating filesets, see mkfset(8).

To remove a file domain, dismount all filesets in the domain you want to remove. Then use the rmfdmn command to remove the file domain. You can also remove the definition of the domain by removing the defining directory and all links under it in the /etc/fdmns directory. To accomplish this, execute the following command line:

# rm -rf /etc/fdmns/domain_name

Although you can use the advscan command to recreate the file domain links, it is good practice to maintain a current hardcopy record of each volume you have. You must have the names of all the volumes in the domain to recreate the /etc/fdmns directory by hand. You must have root user privilege to use the mkfdmn command.

You cannot have more than 100 active file domains at one time. A file domain is active when at least one fileset is mounted.

Each file domain must have a unique name of up to 31 characters. All whitespace characters (tab, line feed, space, and so on) and the / # : * ? characters are invalid for file domain names.
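The naming rule can be checked before a name is passed to mkfdmn. The following Python sketch encodes only what is stated above (1 to 31 characters, no whitespace, none of the characters / # : * ?); the function name is hypothetical, and the sketch does not check uniqueness among existing domains.

```python
INVALID_CHARS = set("/#:*?")


def valid_domain_name(name):
    """Return True if name satisfies the documented length and character rules."""
    if not 1 <= len(name) <= 31:
        return False
    return not any(ch.isspace() or ch in INVALID_CHARS for ch in name)
```

For example, root_domain is accepted, while a name containing a space or a pound sign is rejected.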

DIGITAL UNIX V4.x Specific mkfdmn Information

Systems with file domains that contain very large numbers of files can use more BMT extents (similar to inodes in UFS) than normal. By default, AdvFS attempts to grow the BMT by 128 pages each time additional BMT extents are needed. Frequent requests by the system to increase the BMT cause the metadata to become very fragmented, which can result in an out of disk space error.

You can reduce the amount of metadata fragmentation in one of two ways: by increasing the number of pages by which the system attempts to grow the BMT each time more space is needed, or by preallocating all of the space for the BMT when the file domain is created.

To preallocate all of the BMT space you expect the file domain to need, use the mkfdmn command with the -p flag set to specify the number of pages to preallocate. Space that is preallocated for the BMT cannot be deallocated, so do not preallocate more space than you need for it. The following table provides BMT page number estimates for numbers of files.

To set the BMT to grow by more than 128 pages each time additional metadata extents are needed, use the mkfdmn command with the -x flag set to specify a number of pages greater than 128. You can increase the number of pages to any value; the following table shows suggested guidelines.

If you make a file domain using the -p or -x flags to increase the BMT extent allocations, you must use the same flag with the same number of pages when you add a volume to the file domain with the addvol command. See addvol(8) for information about adding a volume to a file domain.
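The guideline table for the -x and -p values in this section can be expressed as a lookup on the expected number of files. The following Python sketch uses the figures from that table. The function name is hypothetical, and choosing the next larger row for file counts that fall between rows is an assumption of the sketch, as is returning None above the largest row.

```python
# (number of files, suggested BMT extent in pages, BMT size in pages)
BMT_GUIDELINES = [
    (50_000, 128, 3_600),
    (100_000, 256, 7_200),
    (200_000, 512, 14_400),
    (300_000, 768, 21_600),
    (400_000, 1024, 28_800),
    (800_000, 2048, 57_600),
]


def suggest_bmt(expected_files):
    """Return (extent_pages for -x, bmt_pages for -p), or None if beyond the table."""
    for max_files, extent_pages, bmt_pages in BMT_GUIDELINES:
        if expected_files <= max_files:  # assumption: round up to the next row
            return extent_pages, bmt_pages
    return None
```

A domain expected to hold 150,000 files would, under this rounding, use the 200,000-file row: an extent size of 512 pages or a preallocation of 14,400 pages.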

Use a value in the -x num_pages argument that maintains the following ratio between the BMT extent size (the number of pages for the -x parameter) and the log file size (the number of pages for the -l parameter):

BMT extent size <= (log file size * 8184) / 4

It takes about one minute to process 5000 BMT extent size pages with the -x flag. A process that initiates a BMT extent size operation must take into account that very large values for -x will take a long time to complete.
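The ratio and the timing figure above lend themselves to a quick calculation before choosing -x and -l values. The following Python sketch is illustrative only: the function names are hypothetical, the time estimate simply scales the figure of one minute per 5000 pages linearly, and the sketch does not assume whether the ratio is applied before or after the log size is rounded.

```python
def round_log_pages(num_pages):
    """Round a -l value up to a multiple of four, as AdvFS is described as doing."""
    return -(-num_pages // 4) * 4


def max_bmt_extent_pages(log_pages):
    """Largest -x value satisfying: BMT extent size <= (log file size * 8184) / 4."""
    return log_pages * 8184 // 4


def extent_ok(extent_pages, log_pages):
    """Return True if the -x value respects the ratio for the given log size."""
    return extent_pages <= max_bmt_extent_pages(log_pages)


def estimated_minutes(extent_pages):
    """Rough processing time, assuming about one minute per 5000 pages."""
    return extent_pages / 5000
```

With a 512-page log file the ratio allows extent sizes up to 1,047,552 pages, so in practice the processing time is likely to be the more restrictive consideration.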

Number of Files      Suggested BMT Extent (pages)   BMT Size (pages)
Less than 50,000     default (128)                  3,600
100,000              256                            7,200
200,000              512                            14,400
300,000              768                            21,600
400,000              1024                           28,800
800,000              2048                           57,600

mkfset

Description

The mkfset command creates an AdvFS fileset within an existing domain.

/sbin/mkfset domain fileset

domain specifies the name of an existing AdvFS file domain. fileset specifies the name of the fileset to be created in the specified file domain.

Operation

You must create at least one fileset per file domain; however, you can create multiple filesets within a file domain. You can mount and unmount each fileset independently of the other filesets in the file domain. You can assign fileset quotas (block and file usage limits) to filesets. You must have root user privilege to use this utility.

Each fileset within a domain must have a unique name of up to 31 characters. All whitespace characters (tab, new line, space and so on) and the / # : * ? characters are invalid for fileset names.

Tru64 UNIX supports an unlimited number of filesets per system; however, only 512 filesets can be mounted at one time.

mountlist

Description

The mountlist command checks for mounted AdvFS filesets.

/sbin/advfs/mountlist [-v]

Options

-v Prints a list of the mounted filesets.

Operation

The mountlist command is used by the setld -d function. The /usr.smdb./OSFADVFS***.scp routine calls this command to check for mounted filesets before proceeding with the installation.

The exit status from mountlist is 0 if no mounted AdvFS filesets are found. An exit status of 1 indicates either an error occurred or mounted AdvFS filesets were found. You must have root user privilege to use this utility.

ncheck

Description

The ncheck command lists i-number or tag and path name for all files in a file system.

/usr/sbin/ncheck [-i numbers] [-asm] filesystem

filesystem specifies one or more file systems. Specify any file system by entering its full path name. The full path name is the file system’s mount point in the /etc/fstab file. You can also specify a UFS file system by entering the name of its device special file. You can specify an AdvFS fileset by entering the name of the file domain, a pound sign (#) character, and the name of the fileset. For example: root_domain#root.
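An AdvFS fileset is named to ncheck as the domain name, a pound sign (#), and the fileset name, as in root_domain#root. The following Python sketch builds and splits such a specification. The function names are hypothetical; splitting on the first pound sign is safe here because that character is not valid in domain or fileset names.

```python
def fileset_spec(domain, fileset):
    """Join a domain name and a fileset name into the domain#fileset form."""
    return f"{domain}#{fileset}"


def split_fileset_spec(spec):
    """Split a domain#fileset string into (domain, fileset)."""
    domain, sep, fileset = spec.partition("#")
    if not sep or not domain or not fileset:
        raise ValueError("expected the form domain#fileset")
    return domain, fileset
```

A script that accepts either a mount point or a fileset specification can use the presence of the pound sign to tell the two apart.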

Options

-a Includes in the list the path names . (dot) and .. (dot dot), which are ordinarily suppressed.

-i numbers Lists only those files with the specified i-numbers (UFS) or tags (AdvFS).

-m Includes in the list the mode, UID, and GID of the files. To use this flag you must also specify the -i or the -s flag on the command line.

-s Lists only the special files and files with set-user-ID mode.

Operation

The ncheck command with no flags generates a list of all files on every specified file system. The list includes the path name and the corresponding i-number or tag of each file. Each directory file name in the list is followed by a /. (slash dot). Use the available flags to customize the list to include or exclude specific types of files.

The files are listed in order by i-number or tag. To sort the list in a more useful format, pipe the output to the sort command. You must have root user privilege to access this command.

The ncheck command checks the /etc/fstab file for the specified domain and file system entry. If there is no entry in /etc/fstab for the specified file system, an error message is displayed to indicate that the file does not exist.

nvbmtpg

Description

The nvbmtpg command displays pages of an AdvFS BMT file. This command is new in Tru64 UNIX V5.0. This command should be used in place of the vbmtpg and vbmtchain commands provided in earlier releases.

/sbin/advfs/nvbmtpg [-R] [-v] { domain_id | bmt_id } [-f]
/sbin/advfs/nvbmtpg [-R] [-v] bmt_id page [-f]
/sbin/advfs/nvbmtpg [-R] [-v] bmt_id page mcell [-c]
/sbin/advfs/nvbmtpg [-R] [-v] bmt_id [-a]
/sbin/advfs/nvbmtpg [-R] [-v] bmt_id fileset_id [ file_id ] [-c]
/sbin/advfs/nvbmtpg [-R] [-v] bmt_id -s b block
/sbin/advfs/nvbmtpg [-R] [-v] domain_id fileset_id -s f frag
/sbin/advfs/nvbmtpg [-R] [-v] volume_id -b block [ mcell ]
/sbin/advfs/nvbmtpg [-R] volume_id -d dump_file

bmt_id specifies the BMT file on an AdvFS volume or a BMT file that has been saved as a dump_file. Use the following format if you want to specify a dump file:

volume_id | [-F] dump_file

Specify the -F flag to force the command to interpret the name you supply as a file name. domain_id specifies an AdvFS file domain using the following format:

[-r] [-D] domain

Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to indicate the domain argument is to be used as a domain name.

volume_id specifies an AdvFS volume using the following format:

[-V] volume | domain_id volume_index

volume specifies the name of an AdvFS volume in an AdvFS file domain. volume_index specifies the index number of a volume in an AdvFS file domain. Specify the -V flag to indicate the volume argument is to be used as the volume name. The volume argument also can be a full or partial path for the volume, for example /dev/disk/dsk12a or dsk12a. Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.

fileset_id specifies an AdvFS fileset using the following format:

[-S] fileset | -T fileset_tag

Specify the -S flag to indicate the fileset argument is to be used as the fileset name. Specify the fileset by entering either the name of the fileset, fileset, or the fileset’s tag number, -T fileset_tag.

file_id specifies a file name in the following format:

file | [-t] file_tag

Specify the file by entering either the file’s pathname, file, or the file’s tag number, -t file_tag.

dump_file specifies the name of a file that contains the output from this utility. mcell specifies the number of a metadata cell (mcell) from a file. page specifies the file page number of a file.

Options

-a Specifies that all the pages in the BMT be displayed.

-b block Specifies the logical block number of a disk block on an AdvFS volume.

-c Displays the entire chain of mcells.

-d dump_file Specifies the name of a file that will hold the contents of the specified BMT file.

-F Forces the command to interpret the name you supply as a file name.

-f Displays the number of free mcells.

-l Displays the deferred delete list of mcells.

-R Specifies that information about the RBMT is to be displayed.

-s b block Specifies the logical block number of a disk block on an AdvFS volume. When you use this flag, the utility searches the specified BMT file for a mcell that has an extent record for a file that contains the specified block.

-s f frag Specifies the number of a file fragment in the frag file for a fileset. When you use this flag, the utility searches all BMT files (there is one on each AdvFS volume) for a mcell that:

• Belongs to a file in the specified fileset

• Has an attribute record that indicates the file is using the specified frag ID.

-s t tag Specifies the file tag number. The utility searches one or all of the BMT files for a mcell with this tag.

-T fileset_tag Specifies the tag number for a fileset.

-t file_tag Specifies the tag number for a file.

-v Displays all the data in a specified mcell.

volume Specifies the name of an AdvFS volume in an AdvFS file domain.

volume_index Specifies the index number of a volume in an AdvFS file domain.

Operation

The nvbmtpg utility formats, dumps, and displays pages of the bitfile metadata table (BMT) files. BMTs are composed of mcells. Each file in an AdvFS domain is described by a collection of mcells. The mcells for each file are chained together. The first mcell in a chain is called the primary mcell.

AdvFS creates one BMT file for each AdvFS volume in an AdvFS file domain. AdvFS first creates a BMT file on the volume you specify when you run the mkfdmn utility to create an AdvFS file domain. As you add volumes to the domain with the addvol utility, a BMT file is created on each added volume.

The BMT file for a volume never migrates from the volume. When you remove a volume from a domain, the BMT file on the removed volume is, like the volume itself, no longer accessible.

A BMT file is an array of 8 Kbyte file pages, each page containing a header and an array of metadata cells (mcells). The purpose of a BMT file is to contain all the metadata for all files that are stored on an AdvFS volume.
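The idea of a file being described by a chain of mcells that starts at a primary mcell can be pictured with a toy model. The following Python sketch is not the AdvFS on-disk format: the MCell class, its fields, and the (page, cell index) addressing are hypothetical simplifications chosen for illustration. Only the 8 Kbyte page size comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

PAGE_BYTES = 8192  # a BMT file is an array of 8 Kbyte pages

MCellId = Tuple[int, int]  # hypothetical address: (BMT page number, cell index)


@dataclass
class MCell:
    tag: int                        # tag of the file that this mcell describes
    next: Optional[MCellId] = None  # next mcell in the chain, or None at the end


def walk_chain(bmt: Dict[MCellId, MCell], primary: MCellId) -> List[MCellId]:
    """Follow the chain from the primary mcell and return every address visited."""
    chain, seen, cur = [], set(), primary
    while cur is not None:
        if cur in seen:
            raise ValueError("cycle in mcell chain")
        seen.add(cur)
        chain.append(cur)
        cur = bmt[cur].next
    return chain


def page_byte_offset(page: int) -> int:
    """Byte offset of a page within the BMT file (not a disk block address)."""
    return page * PAGE_BYTES
```

In this model, displaying a whole chain corresponds to calling walk_chain with the primary mcell of a file and printing each mcell in turn.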

You can use this command to:

• Display a summary of the BMT on one AdvFS volume or a summary of all the BMT files (there is one per volume) in a domain.

• Display a page of mcells or one mcell or a chain of mcells. The page can be specified by BMT page number or volume block number. An mcell can be specified by a number or by specifying the primary mcell of a file.

• Search for a mcell based on an extent that maps a volume block or a file that uses a given frag ID.

See nvbmtpg(8) for more information.

It can be misleading to use this utility with the -r flag on a domain with mounted filesets. The utility does not synchronize its read requests with AdvFS file domain read and write requests. For example, the AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, metadata may not have been flushed in time for the utility to read it and consecutive reads of the same file page may return unpredictable or contradictory results. To avoid this problem, do not use the -r flag on an active domain.

The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon advfsd is running. The daemon, as it runs, activates the domain for a brief time. If the nvbmtpg utility fails in this situation, run it again.

nvfragpg

Description

The nvfragpg command displays the pages of an AdvFS frag file. This command is new in Tru64 UNIX V5.0. This command should be used in place of the vfragpg command provided in earlier releases.

/sbin/advfs/nvfragpg [-v] [-f] frag_id
/sbin/advfs/nvfragpg [-v] [-f] frag_id page
/sbin/advfs/nvfragpg volume_id -b block
/sbin/advfs/nvfragpg [-v] [-f] domain_id fileset_id -d dump_file

frag_id specifies a frag file using the following format:

domain_id fileset_id | [-F] dump_file

The dump_file is a previously-saved copy of a frag file. Use the -F flag to force the utility to interpret the dump_file as a file name when it has the same name as a domain name.

domain_id specifies an AdvFS file domain using the following format:

[-r] [-D] domain

Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to force the utility to interpret the name you supply in the domain argument as a domain name.

volume_id specifies an AdvFS volume using the following format:

[-V] volume | domain_id volume_index

Specify the -V flag to force the utility to interpret the name you supply in the volume argument as a volume name. The volume name argument also can be a full or partial path name, for example /dev/disk/dsk12a or dsk12a. Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.

fileset_id specifies an AdvFS fileset using the following format:

[-S] fileset | -T fileset_tag

Specify the -S flag to force the command to interpret the name you supply as a fileset name. Specify the fileset by entering either the name of the fileset, or the fileset’s tag number, -T fileset_tag.

file_id specifies a file name in the following format:

file | [-t] file_tag

page specifies the file page number of a file.

Options

-b block Specifies the logical block number of a disk block on an AdvFS volume.

-d dump_file Specifies the name of a file that contains the output of this utility.

-f Displays the frag file free list.

-v Displays all the data in a frag file.

Operation

Use the nvfragpg utility to display information about frag file metadata.

Each fileset in an AdvFS domain has one frag file. Frag files are collections of file fragments. The collections of file fragments in a frag file are called groups, because the file fragments are grouped by file fragment size: file fragments of 1 Kbyte or less are collected in one group; file fragments more than 1 Kbyte up to 2 Kbytes are collected in another group; and so on, up to a group that contains file fragments that are more than 7 Kbytes up to 8 Kbytes.

The first 1024 bytes of each group in a frag file contain the metadata for the file fragments in the group. A group is never larger than 128 Kbytes, so a group that collects 1 Kbyte fragments can hold at most 127 fragments, a group that collects 2 Kbyte fragments can hold at most 63 fragments, and so on. A group that collects 8 Kbyte fragments can hold at most 15 fragments.
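The capacity figures follow from simple arithmetic: a 128 Kbyte group loses 1 Kbyte to metadata, and the remaining 127 Kbytes are divided by the fragment size. The following Python sketch reproduces that calculation and the size-to-group mapping. The function names are hypothetical, and the capacities for the intermediate sizes (3 through 7 Kbytes) are derived from the same formula rather than stated in the text.

```python
GROUP_KB = 128   # a group is never larger than 128 Kbytes
HEADER_KB = 1    # the first 1024 bytes of each group hold metadata


def group_for(size_bytes):
    """Return the fragment size class in Kbytes (1-8) for a fragment of size_bytes."""
    if not 0 < size_bytes <= 8 * 1024:
        raise ValueError("fragments are larger than 0 and at most 8 Kbytes")
    return -(-size_bytes // 1024)  # round up to the next whole Kbyte


def max_frags(frag_kb):
    """Return the most fragments of frag_kb Kbytes that fit in one group."""
    return (GROUP_KB - HEADER_KB) // frag_kb
```

For example, a 1025-byte fragment falls in the 2 Kbyte class, and a group of that class holds at most 63 fragments.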

You can use the nvfragpg command to:

• Display a summary

• Display a single frag file page

• Display corrupted volumes

• Save a frag file

For more information, see nvfragpg(8).

It can be misleading to use this utility with the -r flag on a domain with mounted filesets. The utility does not synchronize its read requests with AdvFS file domain read and write requests. For example, the AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, metadata may not have been flushed in time for the utility to read it and consecutive reads of the same file page may return unpredictable or contradictory results. To avoid this problem, do not use the -r flag on an active domain.

The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the nvfragpg utility fails in this situation, run it again.

nvlogpg

Description

The nvlogpg command displays the log file of an AdvFS file domain. This command is new in Tru64 UNIX V5.0 and should be used in place of the vlogpg and vlsnpg commands provided in DIGITAL UNIX Version 4.0x releases and in place of the logread command in DIGITAL UNIX Version 3.2x releases.

/sbin/advfs/nvlogpg log_id
/sbin/advfs/nvlogpg [-v | -B] log_id page [record_offset [-f]]
/sbin/advfs/nvlogpg [-v | -B] log_id [-R | -a ]
/sbin/advfs/nvlogpg [-v | -B] log_id [-R | -a] page_offset
/sbin/advfs/nvlogpg domain_id | volume_id -d dump_file
/sbin/advfs/nvlogpg [-v | -B] volume_id -b block

log_id specifies a log file in an AdvFS domain or a log file that has been saved by the utility as a dump_file. Use the following format:

domain_id | volume_id [-F] dump_file

Specify the -F flag to force the utility to interpret the name you supply as a file name.

domain_id specifies an AdvFS file domain using the following format:

[-r] [-D] domain

Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to force the utility to interpret the name you supply in the domain argument as a domain name.

volume_id specifies an AdvFS volume using the following format:

[-V] volume | domain_id volume_index

Specify the -V flag to force the utility to interpret the name you supply in the volume argument as a volume name. The volume name argument also can be a full or partial path for the volume, for example /dev/disk/dsk12a or dsk12a. Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.

dump_file specifies the name of a file that contains the output from this utility.

page specifies the file page number of a file. page_offset specifies the offset in the log file. record_offset specifies a byte offset in a page of the log file.

Options

-a Specifies that all the pages in the log file be displayed.

-B Specifies that only the transaction id for each log file entry be displayed.

-b block Specifies the logical block number of a disk block on an AdvFS volume.

-d dump_file Specifies the name of a file that will hold the contents of the specified log file.

-e Specifies that the last active record in the log file is to be displayed.

-f Specifies that all subtransactions of the parent transaction are to be followed.

-s Specifies that the first active record in the log file (the start of the log file) is to be displayed.

-v Displays all the data in a specified log.


Operation

The nvlogpg command locates the log file of an AdvFS file domain and displays records from it in various ways.

The log file for a domain is a bitfile, organized as an array of 8KB disk pages. Each page consists of a fixed-size header record, a number of variable-sized data records, and a variable-sized trailer record. Each data record consists of a fixed-size header and a variable amount of data.

The log file for a domain contains the metadata, the log, of each transaction. Before a transaction is written to disk, its logged metadata is written to disk. Because the log of a transaction contains the information necessary to redo the transaction, the file system can maintain consistency on disk and recover from transaction failures when they occur. When the domain is next activated, for example after a system crash, the logged metadata is used to replay transactions that did not complete.
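The log-first ordering described above can be illustrated with a toy model. The Python sketch below is not AdvFS code: the ToyDomain class, its record layout, and its method names are hypothetical and exist only to show why writing the redo record before the metadata makes replay possible at the next activation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDomain:
    """Hypothetical stand-in for a domain: a durable log plus durable metadata."""
    log: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

    def log_transaction(self, txn_id, changes):
        # Step 1: the redo record reaches the log before any metadata changes.
        self.log.append({"txn": txn_id, "changes": dict(changes), "done": False})

    def apply_transaction(self, txn_id):
        # Step 2: apply the logged changes, then mark the record complete.
        for record in self.log:
            if record["txn"] == txn_id and not record["done"]:
                self.metadata.update(record["changes"])
                record["done"] = True

    def activate(self):
        # At activation, redo every logged transaction that did not complete.
        replayed = []
        for record in self.log:
            if not record["done"]:
                self.metadata.update(record["changes"])
                record["done"] = True
                replayed.append(record["txn"])
        return replayed


domain = ToyDomain()
domain.log_transaction(1, {"tag 7": "extent A"})
domain.apply_transaction(1)
domain.log_transaction(2, {"tag 8": "extent B"})  # simulated crash before step 2
print(domain.activate())  # transactions replayed at the next activation
```

In this model a transaction that was logged but never applied is the only kind that needs replaying; a transaction that never reached the log leaves no trace and therefore no inconsistency.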

Using this command you can:

• Display a summary

• Display log file pages and records

• Save and examine the log file

It can be misleading to use this utility with the -r flag on a domain with mounted filesets. The utility does not synchronize its read requests with AdvFS file domain read and write requests. For example, the AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, metadata may not have been flushed in time for the utility to read it and consecutive reads of the same file page may return unpredictable or contradictory results. To avoid this problem, do not use the -r flag on an active domain.

The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the nvlogpg utility fails in this situation, run it again.

nvtagpg

Description

The nvtagpg command displays a page formatted as a tag file page. This command is new in Tru64 UNIX V5.0. This command should be used in place of the vtagpg command provided in earlier releases.

/sbin/advfs/nvtagpg [-v] tag_id
/sbin/advfs/nvtagpg [-v] tag_id | {page | -a}
/sbin/advfs/nvtagpg [-v] fileset_id file_id
/sbin/advfs/nvtagpg domain_id fileset_id -d dump_file
/sbin/advfs/nvtagpg domain_id -d dump_file
/sbin/advfs/nvtagpg volume_id -b block


tag_id specifies a tag file using the following format:

roottag_id | fileset_id

The roottag_id parameter specifies the root tag file using the following format:

domain_id | [-F] dump_file

The dump_file parameter is a previously-saved copy of the fileset’s tag file. You can use the -F flag to force the utility to interpret the dump_file parameter as a file name if it has the same name as a domain name.

filesettag_id specifies a fileset tag file using the following format:

domain_id fileset_id | [-F] dump_file

domain_id specifies an AdvFS file domain using the following format:

[-r] [-D] domain

Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to force the utility to interpret the name you supply in the domain argument as a domain name.

volume_id specifies an AdvFS volume using the following format:

[-V] volume | domain_id volume_index

Specify the -V flag to force the utility to interpret the name you supply in the volume argument as a volume name. The volume name argument also can be a full or partial path for the volume, for example /dev/disk/dsk12a or dsk12a. Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.

fileset_id specifies an AdvFS fileset using the following format:

[-S] fileset | -T fileset_tag

Specify the -S flag to force the command to interpret the name you supply as a fileset name. Specify the fileset by entering either the name of the fileset, fileset, or the fileset's tag number, -T fileset_tag.

file_id specifies a file name in the following format:

[-F] file | [-t] file_tag


Specify the -F flag to force the command to interpret the name you supply as a file name. Specify the file by entering either the file’s pathname, file, or the file’s tag number, -t file_tag.

page specifies the file page number of a file.

Options

-a Specifies that all the pages in the file be displayed.

-b block Specifies the logical block number of a disk block on an AdvFS volume.

-d dump_file Specifies the name of a file that will hold the contents of the specified tag file.

-v Displays all the data in a specified tag file.

Operation

The nvtagpg utility displays formatted pages of a root tag file or a fileset tag file. The utility can also save a copy of a tag file.

Each AdvFS domain has a root tag file that lists all the filesets in the domain. Each fileset has a tag file that lists all the files in the fileset.

Use the nvtagpg command to:

• Display a root tag file

• Display a fileset tag

• Save the tag file

• Display corrupted AdvFS volumes

It can be misleading to use this utility with the -r flag on a domain with mounted filesets. The utility does not synchronize its read requests with AdvFS file domain read and write requests. For example, the AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, metadata may not have been flushed in time for the utility to read it and consecutive reads of the same file page may return unpredictable or contradictory results. To avoid this problem, do not use the -r flag on an active domain.

The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the nvtagpg utility fails in this situation, run it again.


rmfdmn

Description

The rmfdmn command removes a file domain.

/sbin/rmfdmn [-f] domain

domain specifies the name of an existing file domain.

Options

-f Turns off the message prompt.

Operation

The rmfdmn utility enables you to remove an unused file domain. Before you can remove a file domain, unmount all filesets and clone filesets from the domain using the umount command. If you try to remove a file domain that has mounted filesets, the system displays an error message indicating that a fileset is mounted. AdvFS will not remove the file domain.

The -f flag is useful for scripts when you do not want to be queried for each file domain. If you choose the -f flag, no message prompt will display. The rmfdmn command will operate as if you responded yes to the prompt.

You must have root user privilege to use this command.

You must update the /etc/fdmns directory to delete the file domain entry for the deleted file domain.

rmfset

Description

The rmfset command removes a fileset or a clone fileset from an AdvFS file domain.

/sbin/rmfset [-f] domain fileset

domain specifies the name of an existing AdvFS file domain. fileset specifies the name of the fileset to be removed from the specified file domain.

Options

-f Turns off the message prompt.

Operation

The rmfset command removes a fileset (and all of its files) from an existing AdvFS file domain.


Unmount the fileset before removing it with the rmfset command. A fileset or clone fileset cannot be removed with this command if it is mounted. A fileset that has a clone fileset cannot be removed with this command until the clone fileset has been removed.

The -f flag is useful for scripts or when you do not want to be queried about each fileset. If you choose the -f flag, no prompts are displayed.

You must have root user privilege to use this command.

rmvol

Description

The rmvol command removes a volume from an existing AdvFS file domain.

/usr/sbin/rmvol [-f][-v] special domain

special specifies the block device special file name, such as /dev/disk/dsk2c, of the volume that you are removing from the file domain. domain specifies the name of an existing AdvFS file domain.

Options

-f Turns off the message prompt.

-v Displays messages that describe which files are moved off the specified volume. Using this flag slows the rmvol process.

Operation

The rmvol utility enables you to decrease the number of volumes within an existing file domain. When you attempt to remove a volume, the file system automatically migrates the contents of that volume to another volume in the file domain.

The logical structure of the filesets in a file domain is unaffected when you remove a volume. If you remove a volume that contains a stripe segment, the rmvol utility moves the segment to another volume that does not already contain a stripe segment of the same file. If a file is striped across all volumes in the file domain, the utility requests confirmation before placing a second stripe segment on a volume that has one.

Before you can remove a volume from a file domain, all filesets in the file domain must be mounted. If you try to remove a volume from an active file domain that includes unmounted filesets, the system displays an error message indicating that a fileset is unmounted. This message is repeated until you mount all filesets in the file domain.

If you attempt to remove a volume from an inactive file domain, the system returns the ENO_SUCH_DOMAIN error message. A file domain is inactive when none of its filesets is mounted. In this case, the rmvol command does not remove the volume.


If there is not enough free space on other volumes in the file domain to accept the offloaded files from the departing volume, the rmvol utility moves as many files as possible to free space on other volumes. Then a message is sent to the console indicating that there is not enough space to complete the procedure. The files that were not yet moved remain on the original volume.

You can interrupt the rmvol process without damaging your file domain. AdvFS will stop removing files from the volume. Files already removed from the volume will remain in their new location. Interrupting a rmvol operation with the kill command can leave the volume in an inaccessible state. If a volume does not allow new allocations after an rmvol operation, use the chvol command with the -A flag to reactivate the volume.

You cannot run the rmvol utility while the defragment, balance, rmfset, or rmvol utility is running on the same domain.

You must have root user privilege to use this utility.

salvage

Description

The salvage command recovers file data from damaged AdvFS file domains. This is a new command in the Tru64 UNIX Version 5.0 release. (A field test version of salvage is in the DIGITAL UNIX Version 4.0D release.)

/sbin/advfs/salvage [-x | -p] [-l] [-S] [-v number] [-d time] [-D directory] [-L path] [-o option] { -V special [-V special]... | domain } [fileset [path]]

domain specifies the name of an existing AdvFS file domain from which filesets are to be recovered. Use this parameter when you want the utility to obtain volume information from the /etc/fdmns directory. The volume information used by the utility consists of the device special file names of the AdvFS volumes in the file domain. When the domain parameter is specified without optional arguments, the utility attempts to recover the files in all filesets in the domain.

Do not use this parameter when you want to use the -V special flag to specify device special file names of AdvFS volumes. If you do, the utility displays an error message and exits with an exit value of 2.

fileset [path] specifies the name of a fileset to be recovered from a domain or a volume.

Specify path to indicate the path of a directory or file in a fileset. When you specify a path that is a directory, the utility attempts to recover only the files in that directory tree, starting at the specified directory. When you specify a path that is a file, the utility attempts to recover only that file. Specify path relative to the mount point of the fileset.


Options

-d time Specifies the time, as a decimal number in this format: [[CC]YY]MMDDhhmm[.SS]

-D directory Specifies the path of the directory to which all recovered files are written. If you do not specify a directory, the utility writes recovered files to the current working directory.

-l Specifies verbose mode for messages written to the log file for every file that is encountered during the recovery. If you do not specify this flag, the utility writes a message to the log file only for partially recovered and unrecovered files.

-L path Specifies the path of the directory or the file name for the log file you choose to contain messages logged by this utility. If you include a log file name in the path, the utility uses that file name. If no log file name is specified, the utility places the log file in the specified directory and names it salvage.log.pid (PID is the process ID of the user process). When you do not specify this flag, the utility places the log file in the current working directory and names it salvage.log.pid.

-o option Specifies the action the utility takes when a file being recovered already exists in the directory to which it is to be written. The values for option are:

• yes, overwrite the existing file without querying the user. This is the default action when option is not specified.

• no, do not overwrite the existing file.

• ask, ask the user whether to overwrite the existing file.

-p Specifies that the utility identifies a partially recovered file by appending ’.partial’ to its file name.

-S Specifies that the utility is to run in sequential search mode, checking each page on each volume in the domain. This mode of operation will take a long time on large AdvFS file domains. This flag can be used to recover most files from a domain that has been damaged by an incorrect execution of the mkfdmn utility. In some cases, the recovery will need to generate names based on the file's tag number. These cases usually happen in the root directory, because mkfdmn usually overwrites this directory.

When you specify this flag, there may be a security issue, because the utility could recover old filesets and deleted files.

-F format Specifies that salvage should recover files in an archive format. The only legitimate value for format is ’tar’.

-f [archive] Salvage uses the next argument as the name of an archive. If the name is ’-’, salvage writes to standard output.

-v number Specifies the type of messages directed to stdout. If you do not specify this flag, the default is to direct only error messages to stdout. If you specify number to be 1, both errors and the names of partially recovered files are directed to stdout. If you specify number to be 2, error messages and the status of all files as they are recovered are directed to stdout.

-V special [-V special]... Specifies block device special file names of volumes in the domain, such as /dev/disk/dsk3c. The utility attempts to recover files only from the volumes you specify. If you do not specify the -V flag, you must specify the domain parameter so that the utility can obtain the special file names of the volumes in the domain from the /etc/fdmns directory. Do not use this flag with the domain parameter. If you do, an error message is displayed and the utility exits with an exit value of 2.

-x Specifies that partially recoverable files are not to be recovered. If you do not use this flag, partially recoverable files are recovered. Do not use the -x flag with the -p flag. If you do, the utility displays an error message and exits with an exit value of 2.
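The -d time format, [[CC]YY]MMDDhhmm[.SS], packs several optional fields into one string. As an aid to reading it, here is a minimal Python sketch that decodes such a string. It is not the salvage utility's own parser: the function name is hypothetical, and the treatment of an omitted century or year follows the POSIX touch -t convention (two-digit years 69 to 99 map to the 1900s, 00 to 68 to the 2000s, and a missing year means the current year), which is an assumption here.

```python
from datetime import datetime


def parse_salvage_time(text, now=None):
    """Decode a [[CC]YY]MMDDhhmm[.SS] string into a datetime (sketch only)."""
    now = now or datetime.now()
    digits, dot, seconds = text.partition(".")
    if not digits.isdigit() or len(digits) not in (8, 10, 12):
        raise ValueError(f"expected [[CC]YY]MMDDhhmm[.SS], got {text!r}")
    if dot and (len(seconds) != 2 or not seconds.isdigit()):
        raise ValueError(f"seconds must be two digits, got {text!r}")

    if len(digits) == 12:            # CCYY given
        year = int(digits[:4])
    elif len(digits) == 10:          # only YY given; century is assumed
        yy = int(digits[:2])
        year = (1900 if yy >= 69 else 2000) + yy
    else:                            # no year given; assume the current year
        year = now.year

    # The last eight digits are always MMDDhhmm.
    start = len(digits) - 8
    month, day, hour, minute = (
        int(digits[i:i + 2]) for i in range(start, len(digits), 2)
    )
    # datetime() itself rejects out-of-range fields such as month 13.
    return datetime(year, month, day, hour, minute, int(seconds) if dot else 0)


print(parse_salvage_time("200004291530.05"))
```

Passing a fixed now value makes the year-less form reproducible, which is convenient when checking a timestamp before using it on a command line.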


Operation

The salvage utility helps you recover file data after an AdvFS file domain has become unmountable due to some type of data corruption. Errors that could cause data corruption of a file domain include I/O errors in file system metadata, the accidental removal of a volume, or any I/O error that produces a panic.

Use the salvage utility as a last resort. You should first repair domain structures by using the verify utility. If that repair method is unsatisfactory, attempt to recover fileset data from backup media. Only if both methods are unsatisfactory should you employ the salvage utility.

The salvage utility opens and reads block devices directly and could present a security issue if it recovers data remaining from previous AdvFS file domains while attempting to recover data from current AdvFS file domains.

The salvage utility can be run in single user mode, without mounting other file systems. The salvage utility is available from the UNIX Shell option when you are booting from the Tru64 UNIX Operating System Volume 1 CDROM.

The salvage utility can find metadata on disk that appears valid but might not be. In most cases, the utility can determine whether this suspect metadata should be used or ignored. One problem that the utility cannot detect arises when the metadata contains a tag number that could be valid on a fileset with a very large number of files but is usually invalid for common filesets. In this case, the utility creates a partial file in the lost+found directory.

The salvage utility has a built-in soft limit on the number of valid tags in a fileset: 10,000,000 tags. If an application should exceed this soft limit, the user is prompted about increasing the limit.

You must have root user privilege to use the salvage utility.

By the time you use the salvage utility, all filesets in the domain you are trying to recover have probably already been unmounted. Nevertheless, use the umount(8) command to ensure that the filesets are unmounted.

savemeta

Description

savemeta [-LSTtf] domain savedir

This script saves a snapshot of the specified domain metadata into a directory, savedir, that has the following structure:

/savedir/volume_directory/BMT_file
/log_file
/tag_file
/fileset_directory/frag_file
/tag_file

Options

-L Does not write the domain’s log file to the savedir.

-S Does not save the volume’s SBM to the savedir.

-T Does not save the domain’s root tag file to the savedir.

-t Does not save the fileset tag files to the savedir.

-f Saves the structure information from the frag file in each fileset to the savedir.

shblk

Description

The shblk command displays unformatted disk blocks.

/sbin/advfs/shblk [-sb start_block] [-bc block_count] special

special specifies the volume on which the block(s) are located.

Options

-sb start_block Specifies the number of the first block to display.

-bc block_count Specifies the number of blocks to print.

Operation

The shblk command displays an unformatted hexadecimal listing of the information that is present in the selected blocks.

You must have root user privileges to access this command.

shfragbf

Description

Use this command to display how much space is used in the frag file.

/sbin/advfs/shfragbf file_system/.tags/1

file_system specifies the fileset mount point of the file system to display.

Operation

This command also displays the following frag file information:

• The frag type is listed as 0K when the frag file is not in use. The type is listed as 1K for 1K frags, and so forth.

• Grps specifies the number of groups of the frag type.

• bad specifies the number of bad group headers of this type.

• Frags specifies the number of fragments of this type.

• free specifies the number of free frags of this type.

• in-use specifies the number of fragments in use.

• Bytes specifies the total bytes in this frag type.

You must have root user privilege to access this command.

showfdmn

Description

The showfdmn command displays the attributes of a file domain and detailed information about each volume in the file domain.

/sbin/showfdmn [-k] domain

domain specifies the name of an existing AdvFS file domain.

Options

-k Displays the total number of blocks and the number of free blocks in terms of 1K blocks instead of the default 512-byte blocks.

Operation

The showfdmn command displays the following file domain attributes:

• Id is a unique number (in hexadecimal format) that identifies the file domain.

• Date Created is the day, month, and time that a file domain was created.

• LogPgs is the number of 8-kilobyte pages in the transaction log of the specified file domain.

• Version is an internal-use-only version number for the AdvFS on-disk data structures. This number is not related to the version number of the base operating system.

• Domain Name is the name of the file domain.

The command also displays the following volume information:

• Vol is the volume number within the file domain. An L next to the number indicates that the volume contains the transaction log.

• 512-Blks is the size of the volume in 512-byte blocks.

• 1K-Blks is the size of the volume in 1K blocks.

• Free is the number of blocks in a volume that are available for use.

• % Used is the percent of the volume space that is currently allocated to files or metadata.

• Cmode is the I/O consolidation mode. The default is on.

• Rblks is the maximum number of 512-byte blocks read from the volume at one time.

• Wblks is the maximum number of 512-byte blocks written to the volume at one time.

• Vol Name is the name of the special device file for the volume.

For multivolume file domains, the showfdmn command also displays the total volume size, total number of free blocks, and the total percent of volume space currently allocated.

A file domain must be active before the showfdmn command can display volume information. A file domain is active when at least one fileset in the file domain is mounted.
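The size-related columns described above are related by simple arithmetic. The Python sketch below shows those relationships under stated assumptions: the function names are hypothetical, % Used is taken to be allocated space divided by total space, and no attempt is made to reproduce how showfdmn itself rounds the values it prints.

```python
LOG_PAGE_BYTES = 8 * 1024  # LogPgs counts 8-kilobyte pages


def blocks_512_to_1k(blocks_512):
    """Convert a 512-Blks figure to the 1K-Blks figure shown with -k."""
    return blocks_512 // 2


def percent_used(total_blocks, free_blocks):
    """Share of a volume that is allocated to files or metadata."""
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    return 100.0 * (total_blocks - free_blocks) / total_blocks


def log_size_bytes(log_pages):
    """Size of the transaction log implied by a LogPgs value."""
    return log_pages * LOG_PAGE_BYTES


def domain_totals(volumes):
    """Sum (size, free) pairs for a multivolume domain and derive % Used."""
    total = sum(size for size, _ in volumes)
    free = sum(free for _, free in volumes)
    return total, free, percent_used(total, free)


print(domain_totals([(1000, 250), (3000, 750)]))
```

Note that the domain-wide percentage is computed from the summed blocks, not by averaging the per-volume percentages, so a large nearly-full volume outweighs a small empty one.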

showfile

Description

The showfile command displays the attributes of one or more AdvFS files.

/usr/sbin/showfile [-i] [-h | -x] filename...

filename... is one or more directory or file names. To display all the files in the current directory, use an asterisk (*) as the filename argument.

Options

-h Displays the raw extent map including any holes.

-i When a filename is a directory, displays the attributes for the directory’s index file. (V5.x only)

-x Displays the full storage allocation map (extent map) for files in an Advanced File System.

Operation

The showfile command also displays the extent map of each file. An extent is a contiguous area of disk space that the file system allocates to a file. Simple files have one extent map; striped files have an extent map for every stripe segment.

You can list AdvFS attributes for an individual file or the contents of a directory. Although the showfile command lists both AdvFS and non-AdvFS files, the command displays meaningful information for AdvFS files only.

The showfile command displays the following file attributes:

• Id is the unique number (in hexadecimal format) that identifies the file. Digits to the left of the dot (.) character are equivalent to a UFS inode.

• Vol is the location of primary metadata for the file, expressed as a number. The data extents of the file can reside on another volume.

• PgSz is the page size in 512-byte blocks.

• Pages is the number of pages allocated to the file.

• XtntType is the extent type. The extent type can be simple, which is a regular AdvFS file without special extents; stripe, which is a striped file; symlink, which is a symbolic link to a file; ufs, nfs, and so on. The showfile command cannot display attributes for symbolic links or non-AdvFS files.

• Segs is the number of stripe segments per striped file, which is the number of volumes a striped file crosses. (Applies only to stripe type.)

• SegSz is the number of pages per stripe segment. (Applies only to stripe type.)

• I/O is the type of write request to this file:

— async specifies that write requests are buffered (the AdvFS default).

— synch specifies forced synchronous writes as described in chfile(8).

— ftx specifies that write requests are executed under AdvFS transaction control (reserved for metadata files and directories).

• Perf is the efficiency of file-extent allocation, expressed as a percentage of the optimal extent layout. A high percentage, such as 100%, indicates that the AdvFS I/O system has achieved optimal efficiency. A low percentage indicates the need for file defragmentation.

• File is the name of the directory or file. If the file is a directory that has an index file associated with it and the -i flag has not been specified, the statistics displayed are for the directory. The term index follows the directory name. If the file is a directory that has an index file associated with it and the -i flag is specified, the statistics displayed are for the index file associated with the directory. The name of the directory follows the index.

Whereas a simple file has one extent map, a striped file has more than one extent map. An extent map displays the following information:

• pageOff is the starting page number of the extent.

• pageCnt is the number of pages in the extent.

• vol is the location of the extent, expressed as a number.

• volBlock is the starting block number of the extent.

• blockCnt is the number of blocks in the extent.

• extentCnt is the number of extents.
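The extent map fields above can be cross-checked against one another. The Python sketch below models one extent map entry and derives the block count, the extent count, and any holes in the page range. The Extent type and the function names are hypothetical, and the default page size of 16 blocks (an 8-kilobyte page in 512-byte blocks) is an assumption; use the PgSz value that showfile reports for the file.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    page_off: int   # pageOff: starting page number of the extent
    page_cnt: int   # pageCnt: number of pages in the extent
    vol: int        # vol: volume that holds the extent
    vol_block: int  # volBlock: starting block number on that volume


def summarize(extents, pg_sz=16):
    """Derive extentCnt, total pages, and total 512-byte blocks."""
    pages = sum(e.page_cnt for e in extents)
    return {"extentCnt": len(extents), "pages": pages, "blocks": pages * pg_sz}


def find_holes(extents):
    """Return (first_page, page_count) for page ranges no extent covers."""
    holes = []
    next_page = 0
    for e in sorted(extents, key=lambda e: e.page_off):
        if e.page_off > next_page:
            holes.append((next_page, e.page_off - next_page))
        next_page = max(next_page, e.page_off + e.page_cnt)
    return holes


extent_map = [Extent(0, 4, 1, 1000), Extent(8, 2, 2, 5000)]
print(summarize(extent_map))
print(find_holes(extent_map))
```

A gap between the end of one extent and the pageOff of the next corresponds to a hole, which is the case the -h flag is meant to expose.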


showfsets

Description

The showfsets command displays the filesets (or clone filesets) and their characteristics in a specified domain.

/sbin/showfsets [-b | -q] [-k] domain [fileset...]

domain specifies the full path name of the file domain. fileset... specifies the name of one or more filesets.

Options

-b Lists the names of the filesets in a domain, without additional detail.

-k Displays the total number of blocks and the number of free blocks in terms of 1K blocks instead of the default 512-byte blocks.

-q Displays quota limits for filesets in a domain.

Operation

The following fileset characteristics are displayed:

• Fileset identifier is a combination of the file-domain identifier and an additional set of numbers that identify the fileset within the file domain.

• Clone status can include:

— Clone is specifies the name of a clone fileset, if one exists for the parent fileset.

— Clone of specifies the name of the parent fileset, if the displayed fileset is a clone fileset.

— Revision specifies the number of times you revised a clone fileset.

• Files specifies the number of files in the fileset and the current file usage limits (quotas). SLim, the soft limit, is a quota that can be exceeded for a fixed period of time and HLim, the hard limit, is a quota that cannot be exceeded.

• Blocks specifies the number of blocks that are in use by a mounted fileset and the current block soft and hard usage limits (quotas). For filesets that are not mounted, zero blocks will display. For an accurate display, the fileset must be mounted.

• Quota Status specifies which quota types are enabled (enforced).

The showfsets command with the -q flag set displays block and file information for a specified domain or for one or more named filesets in the domain. The characteristics of a named fileset are:

• BF (block flag) specifies block (B) and file (F) usage limits. A + in this field means that the soft block usage is exceeded; a * means that the hard limit is reached.

• Block (512) Limits specifies the number of blocks used, the soft limit (the number of blocks that can be exceeded for a period of time), the hard limit (the number of blocks that cannot be exceeded), and the grace period (the remaining time for which the soft limit may be exceeded).

• File Limits specifies the number of files used, the soft and hard file limits for the fileset, and the grace period remaining.
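The + and * markers in the BF field follow directly from comparing usage with the soft and hard limits. The Python sketch below is one way to express that rule; the function names are hypothetical, and treating a limit of 0 as "no limit set" is an assumption rather than something stated above.

```python
def limit_flag(used, soft, hard):
    """Return '*', '+', or ' ' for one usage counter."""
    if hard and used >= hard:   # hard limit reached
        return "*"
    if soft and used > soft:    # soft limit exceeded
        return "+"
    return " "


def bf_field(blocks_used, block_soft, block_hard,
             files_used, file_soft, file_hard):
    """Build a two-character field: block flag, then file flag."""
    return (limit_flag(blocks_used, block_soft, block_hard)
            + limit_flag(files_used, file_soft, file_hard))


print(repr(bf_field(120, 100, 200, 10, 0, 0)))
```

The hard limit is tested first because usage that has reached the hard limit has necessarily passed the soft limit as well, and the stronger marker is the more useful one to report.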

stripe

Description

The stripe utility enables you to improve the read/write performance of a file by spreading it evenly across several volumes in a file domain.

/usr/sbin/stripe -n volume_count filename

filename specifies the name of the file to stripe.

You must have root user privileges to access this command.

Operation

The stripe utility directs a zero-length file (a file with no data written to it yet) to be spread evenly across several volumes within a file domain. As data is appended to the file, the data is spread across the volumes. AdvFS determines the number of pages per stripe segment and alternates the segments among the disks in a sequential pattern.

Existing, nonzero-length files cannot be striped using the stripe utility. To stripe an existing file, create a new file, use the stripe utility to stripe the new file, and copy the contents of the file you want to stripe into the new striped file. After copying the file, delete the nonstriped file.

Once a file is striped, you cannot use the stripe utility to modify the number of disks that a striped file crosses. To change the volume count of a striped file, you can create a second file with a new volume count, and then copy the contents of the first file into the second file. After copying the file, delete the first file.
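The sequential alternation of stripe segments described in this section can be modeled as a round-robin mapping from file pages to stripes. The Python sketch below is an illustration only: AdvFS chooses the number of pages per segment itself, so seg_pages is a stand-in parameter, the function names are hypothetical, and the stripe indexes are zero-based positions rather than AdvFS volume numbers.

```python
def stripe_for_page(page, seg_pages, volume_count):
    """Return the zero-based stripe that holds the given file page."""
    if page < 0 or seg_pages <= 0 or volume_count <= 0:
        raise ValueError("page must be >= 0; seg_pages and volume_count > 0")
    # Consecutive segments of seg_pages pages rotate through the stripes.
    return (page // seg_pages) % volume_count


def layout(total_pages, seg_pages, volume_count):
    """List the file pages that land on each stripe."""
    stripes = [[] for _ in range(volume_count)]
    for page in range(total_pages):
        stripes[stripe_for_page(page, seg_pages, volume_count)].append(page)
    return stripes


print(layout(8, 2, 3))
```

Because the mapping depends on volume_count, changing the number of volumes would move most pages to a different stripe, which is consistent with the need to copy the data into a newly striped file instead of restriping in place.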

switchlog

Description

The switchlog command moves an AdvFS file domain transaction log.

/sbin/advfs/switchlog domain_name vol_id


domain_name specifies the name of an existing file domain.

vol_id specifies the number of the new volume to use for the log.

Operation

The switchlog command relocates the transaction log of the specified file domain to a different volume in the same file domain. Moving the transaction log within a multivolume file domain is typically done to place the log on either a faster, less congested, or mirrored volume.

Use the showfdmn command to determine the current location of the transaction log. In the showfdmn command display, the letter L displays next to the volume that contains the log. The showfdmn command also displays all of the volumes and their volume numbers.

You must have root user privilege to execute this command.

tag2name

Description

The tag2name command displays the path name of a file given the tag number.

/sbin/advfs/tag2name tags_directory/tag
/sbin/advfs/tag2name [-r] domain fileset tag

domain specifies the name of an AdvFS file domain.

fileset specifies the name of an AdvFS fileset.

tags_directory specifies the relative path of the AdvFS tags directory for the fileset.

tag specifies the AdvFS file tag number.

Options

-r Operates on the raw device (character device special file) of the fileset instead of the block device.

Operation

Internally, AdvFS identifies files by tag numbers (similar to inodes in UFS). Internal messages, error messages, and output from diagnostic utilities usually specify a tag number in place of a file name. Use the tag2name command to determine the name and path of the file identified by a tag number.

Each mounted AdvFS fileset has a .tags directory in its mount point. To obtain the file name, specify the path to the .tags directory for the fileset, followed by the tag number. The full path name of the corresponding file will be printed to stdout.

The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the tag2name utility fails in this situation, run it again.


You must have root user privilege to access this command. The tag you specify must be numeric and greater than 1.

vbmtchain

Description

The vbmtchain utility displays metadata for a file, including the time stamp, extent map, and whether the file is a user directory or data file. (Valid for V4.x only. Use nvbmtpg for V5.0.)

/sbin/advfs/vbmtchain BMT_page cell special [special 2...]

BMT_page specifies the page within the bitfile metadata table (BMT) of the volume that contains the file’s mcell.

cell specifies the cell of the BMT page that contains the file’s mcell.

special specifies the volume on which the file’s primary mcell is located.

special 2... specifies the other volumes in this domain that may be accessed to follow the file’s mcell chain.

Operation

The file is described by the location of its primary mcell. Each mcell location is composed of three parts: volume, page within the BMT file located on that volume, and cell within the BMT page.

The primary mcell for the root tag directory is found in the BMT of the volume containing the log. To find this volume for a domain, use the showfdmn or the advscan command. The volume marked "L" contains the log.

Certain metadata files are in fixed locations:

Metadata file            Page  Cell  Volume
Bitfile metadata table   0     0     Every volume
Storage bitmap           0     1     Every volume
Root tag directory       0     2     Volume with log
Transaction log file     0     3     Volume with log

The vbmtchain utility displays the attributes of the file, including the time stamp, the extent map, and whether the file is a user directory or a data file.

You must have root user privileges to access this command.

vbmtpg

Description

The vbmtpg utility displays a complete, formatted page of the BMT for a mounted or unmounted domain. (Valid for V4.x only. Use nvbmtpg for V5.0.)

/sbin/advfs/vbmtpg special [page_LBN]


special specifies the volume on which the page is located.

page_LBN specifies the logical block number (LBN) of the requested page; the default is 32, which is page zero of the bitfile metadata table (BMT).

Operation

The vbmtpg utility is useful for debugging when there has been some seemingly random file corruption.

Note that the vbmtchain command displays all the mcells associated with a given file whereas the vbmtpg command displays a page of information. This page may contain information for more than one file and may not provide complete information on any file.

vdf

Description

The vdf utility displays disk information for AdvFS domains and filesets. This command is new in Tru64 UNIX V5.0.

/sbin/advfs/vdf [-k][-l] domain | domain#fileset

domain is the full path name of an AdvFS file domain. When a domain argument is specified, the default display contains information about: the number of disk blocks allocated to the domain; the number of disk blocks in use by the domain; and the number of disk blocks that are available to the domain.

domain#fileset is the name of an AdvFS fileset in an AdvFS file domain. When a domain#fileset argument is specified, the default display contains information about: the number of disk blocks allocated to the fileset; the number of disk blocks in use by the fileset; and the number of disk blocks that are available to the fileset. This information is in the same format as that displayed by the df command.

Options

-k Displays disk blocks as 1024-byte blocks instead of the default of 512-byte blocks.

-l Specifies that the default information for both the domain and filesets is reformatted to show the relationships between them. For example, any domain metadata displayed is the total metadata shared by filesets in the domain.

Operation

The vdf utility is a script that reformats output from the showfdmn, showfsets, shfragbf, and df utilities in order to display information about the disk usage of AdvFS file domains and filesets. In addition, the utility computes and displays the sizes of metadata files in a domain or fileset.

The disk space used by clone filesets is not calculated. If clone filesets are present in the specified domain, the utility displays a warning message.


You must have root user privilege to access this command.

This command cannot be used on filesets that are NFS mounted. All filesets in a domain must be mounted in order to calculate the disk usage of the domain.

vdump

Description

The vdump (rvdump) utility performs full and incremental backups on filesets.

/sbin/vdump -h
/sbin/vdump -V
/sbin/vdump -w
/sbin/vdump [-0..9] [-CDNUquv] [-F num_buffers] [-T tape_num] [-b size] [-f device] [-x num_blocks] fileset
/sbin/rvdump -h
/sbin/rvdump -V
/sbin/rvdump -w
/sbin/rvdump [-0..9] [-CDNUquv] [-F num_buffers] [-T tape_num] [-b size] [-f nodename:device] [-x num_blocks] fileset

fileset specifies the full path name of a mounted AdvFS fileset to be backed up. Alternatively, specifies a mounted NFS or UFS file system. When used with the -D flag, specifies a subdirectory.

Options

-h Displays usage help for the command.

-V Displays the current version of the command.

-w Displays the filesets that have not been backed up within one week.

-0..9 Specifies the backup level. The value 0 for this flag causes the entire fileset to be backed up to the storage device. The default backup level is 9.

-C Compresses the data as it is backed up, which minimizes the saveset size.

-D Performs a level 0 backup on the specified subdirectory. This flag overrides any backup level specification in the command. If this flag is specified, the AdvFS user and group quota files and the fileset quotas are not backed up.

-N Does not rewind the storage device, when it is a tape.

-P Produces backward compatible savesets that can be read by earlier versions of the vrestore command.

-U Does not unload the storage device, when it is a tape.

-q Displays only error messages; does not display information messages.

-u Updates /etc/vdumpdates file with a timestamp entry from beginning of backup.

-v Displays the names of the files being backed up.


Operation

The vdump command backs up files and any associated extended attributes (including ACLs, see the proplist(4) and acl(4) reference pages) from a single mounted fileset or clone fileset to a local storage device.

The rvdump command backs up files and any associated extended attributes (including ACLs, see the proplist(4) and acl(4) reference pages) from a single mounted fileset or clone fileset to a remote storage device.

The vdump and rvdump commands are the backup facility for the AdvFS file system. However, the commands are file-system independent, and you can use them to back up other file systems, such as UFS and NFS.

The commands back up all files in the specified fileset that are new or changed since a certain date and produce a saveset on the storage device. The date is determined by comparing the specified backup level to previous backup levels recorded in the /etc/vdumpdates file. The default storage device is /dev/tape/tape0_d1. You can specify an alternate storage device by using the -f flag.

The commands perform either an incremental backup, level 9 to 1, or a full backup, level 0, depending on the desired level of backup and the level of previous backups recorded in the /etc/vdumpdates file. The commands back up all files that are new or have changed since the latest backup date of all backup levels that are lower than the backup level being performed. If a backup level that is lower than the specified level does not exist, the commands initiate a level 0 backup. A level 0 backup backs up all the files in the fileset.
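The level rule above can be expressed as a short sketch. It does not read /etc/vdumpdates, whose record format is not described here; instead it takes hypothetical (level, timestamp) pairs for one fileset, and the function name backup_base_date is likewise an illustration only.

```python
def backup_base_date(records, level):
    """Pick the reference date for a backup at the given level.

    records: iterable of (level, timestamp) pairs for earlier backups of
             one fileset (a hypothetical in-memory form of the dump history).
    Returns the latest timestamp among backups whose level is lower than
    the requested level, or None when no such backup exists, in which
    case a full (level 0) backup is performed.
    """
    lower = [timestamp for lvl, timestamp in records if lvl < level]
    return max(lower) if lower else None
```

For example, with earlier backups at levels 0, 5, and 9, a new level 9 backup would use the date of the level 5 backup if that is the most recent of the lower-level backups. A level 0 request never has a lower level to compare against, so it always returns None.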

-F num_buffers Specifies the number of in-memory buffers to use. The valid range is 2 through 64 buffers; the default is 8 buffers. The size of the in-memory buffers is determined by the value of the -b flag.

-T tape_num Specifies the starting number for the first tape. The default number is 1. The tape number is used only to prompt the operator to load another tape in the drive.

-b size Specifies the number of 1024-byte blocks per record in the saveset. The valid range is 1 through 64 blocks; the default is 60 blocks per record. The value of this flag also determines the size of the in-memory buffers.

-f device | -f nodename:device Specifies the destination of the saveset. For vdump, the local destination can be a device, a file, or, when the - (dash) character is specified, standard output. For rvdump, the specification must be in the format nodename:device to specify the remote machine name that holds the device, file, or standard output.

-x num_blocks Performs an "exclusive or" (XOR) operation each time the number of blocks specified by num_blocks is written to the saveset. The XOR operation is performed on those blocks, and the result is written to the saveset as an XOR block that immediately follows them. Subsequently, you can use the vrestore command to recover one of the blocks in the group should a read error occur. The valid range is 2 through 32 blocks; the default is 8 blocks. Using the -x flag creates larger savesets and increases the amount of time required to back up a file system, but offers additional protection from saveset errors.
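The reason a single XOR block can restore one unreadable block in its group can be shown with a small sketch. This illustrates only the XOR principle on equal-length byte strings; it does not reproduce the actual saveset format, and the function names are hypothetical.

```python
def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def recover_block(group, parity, missing_index):
    """Rebuild the block at missing_index from the surviving blocks
    of the group and the group's XOR block."""
    survivors = [b for i, b in enumerate(group) if i != missing_index]
    return xor_blocks(survivors + [parity])
```

Because XOR is its own inverse, combining the XOR block with every surviving block cancels their contributions and leaves the missing block. Only one block per group can be recovered this way, which matches the description above.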


After the backup operation is complete, you can use the vrestore -t command to verify that the backup contains the files you wanted to save. This command lists the name and size of each file in the saveset without restoring them.

The vdump and rvdump commands do not back up filesets that are not mounted. Filesets backed up by using the vdump or the rvdump command must be restored by using the vrestore or the rvrestore command. The vdump and rvdump commands are not interchangeable with the dump and rdump commands. Similarly, the vrestore and the rvrestore commands are not interchangeable with the restore and rrestore commands.

The vrestore command in DIGITAL UNIX versions earlier than Version 4.0 cannot be used to restore savesets produced by the vdump command in DIGITAL UNIX Version 4.0 or higher systems.

The /etc/vdumpdates file is written in ASCII and consists of a single record per line. You must be the root user to update this file or to change any record field. If you edit the /etc/vdumpdates file, be certain that all records follow the correct format. An incorrectly formatted record in this file may make the file inaccessible for updates or reads.

See the manpage for more information.

verify

Description

The verify command checks on-disk structures such as the bitfile metadata table (BMT), the storage bitmaps, the tag directory and the frag file for each fileset. The verify command should be used in place of the msfsck and vchkdir commands available in DIGITAL UNIX V3.2x.

/sbin/advfs/verify [-a | -f] [-l | -d] [-v | -q] [-t] [-r] [-F] domain_name

domain_name specifies the file domain.

Options

-a Checks an active domain. All filesets of a domain must be mounted.

-f Creates a symbolic link to "fix" a lost file in the /mount_point/lost+found directory; deletes any directory entries without associated files; deletes files that have storage-bitmap or extent-map problems; corrects inconsistencies in the storage bitmap.

-d Deletes lost files (that is, with no directory entry).

-D Checks a domain previously mounted with the -o dual option of the mount command.

-l Creates a symbolic link to the lost file in the /mount_point/lost+found directory.

-v Prints file status information. Selecting this flag slows down the verify procedure.

-q Prints minimal file status information.


Operation

This command verifies that the directory structure is correct and that all directory entries reference a valid file (tag) and that all files (tags) have a directory entry.

The verify command checks the storage bitmap for double allocations and missing storage. It checks that all mcells in use belong to a bitfile and that all bitfiles have all of their mcells.

The verify command checks the consistency of free lists for mcells and tag directories. It checks that the mcells pointed to by tags in the tag directory match the corresponding mcells.

For each fileset in the specified file domain, the verify command checks the frag file headers for consistency. For each file that has a fragment, the frag file is checked to ensure that the frag is marked as in use.

You must have root user privilege to access this command.

Unless you are checking the root domain, all filesets in the file domain must be unmounted. The verify command automatically mounts all of the filesets in a file domain individually. If you choose the -r option when you run the verify command on the root domain, all filesets in the root domain must be mounted.

Run verify on /root and /usr from single-user mode. To run verify in single-user mode, you must first run a mount update on the root (mount -u /). To run the command from multiuser mode, dismount any file system that you have mounted as /root or /usr and make sure there is no file activity.

If you run the verify command on a fileset that has any other file system (AdvFS or otherwise) mounted on it, an error results. If you have a fileset erroneously labeled as UFS and it overlaps a fileset labeled AdvFS, an error results. You can recover from this error by changing the erroneously-labeled fileset’s fstype field from ufs to unused with the disklabel -s command. After changing the disk label, run the verify command.

If the -F option is specified and the verify command is unable to mount a fileset due to a failure of the file domain, the fileset is mounted using the mount -d option. Use this option with extreme caution and only as a last resort when you cannot mount a fileset. The mount -d option mounts an AdvFS fileset without running recovery on the file domain. Mounting without running recovery will cause your file domain to be inconsistent.

If you use the -F option, the verify command starts some recovery on the file domain before you mount it.

-t Displays the mcell totals.

-r Checks the root domain.

-F Mounts filesets of a file domain using mount -d option if there is a mount failure of the file domain. Use this flag with caution.


vfile

Description

The vfile utility outputs the contents of a file from an unmounted domain. In Tru64 UNIX Version 5.0, the vfilepg command should be used in place of this command.

/sbin/advfs/vfile BMT_page cell special [special 2 ...]

BMT_page specifies the page within the bitfile metadata table (BMT) of the volume that contains the file’s mcell.

cell specifies the cell of the BMT page that contains the file’s mcell.

special specifies the volume on which the file’s primary mcell is located.

special 2... specifies the other volumes in this domain that may be accessed to follow the file’s mcell chain.

Operation

The file is identified by the location of its primary mcell. Each mcell location is composed of three parts: volume, page within the BMT file located on that volume, and cell within the BMT page.

The primary mcell for the root tag directory is found in the BMT of the volume containing the log. To find this volume for a domain, use the showfdmn or the advscan command. The volume marked "L" contains the log.

Certain metadata files are in fixed locations:

Metadata file            Page  Cell  Volume
Bitfile metadata table   0     0     Every volume
Storage bitmap           0     1     Every volume
Root tag directory       0     2     Volume with log
Transaction log file     0     3     Volume with log

You must have root user privilege to access this command.

vfilepg

Description

The vfilepg command displays pages of an AdvFS file. This command is new in Tru64 UNIX V5.0. The vfilepg command should be used in place of the vfile command.

/sbin/advfs/vfilepg domain_id fileset_id file_id [ page | -a ] [-f d ]
/sbin/advfs/vfilepg volume_id -b block
/sbin/advfs/vfilepg domain_id fileset_id file_id -d dump_file
/sbin/advfs/vfilepg [-F] dump_file [ page | -a ] [-f d ]


domain_id specifies an AdvFS file domain using the following format:

[-r] [-D] domain

Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to force the utility to interpret the name you supply in the domain argument as a domain name.

volume_id specifies an AdvFS volume using the following format:

[-V] volume | domain_id volume_index

Specify the -V flag to force the utility to interpret the name you supply in the volume argument as a volume name. The volume name argument also can be a full or partial path for the volume, for example /dev/disk/dsk12a or dsk12a. Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.

fileset_id specifies an AdvFS fileset using the following format:

[-S] fileset | -T fileset_tag

Specify the -S flag to force the command to interpret the name you supply as a fileset name. Specify the fileset by entering either the name of the fileset, fileset, or the fileset’s tag number, -T fileset_tag.

file_id specifies a file name in the following format:

file | [-t] file_tag

Specify the file by entering either the file’s pathname, file, or the file’s tag number, -t file_tag.

dump_file specifies the name of a file that contains the output from this utility.

page specifies the file page number of a file.

Options

-a Specifies that all the pages in the file be displayed.

-b block Specifies the logical block number of a disk block on an AdvFS volume.

-d dump_file Specifies the name of a file that will contain the output of this utility.

-f d Specifies that the output is to be formatted in a directory hierarchy. The default, if this flag is not specified, is to format the output as a hexadecimal and ASCII dump.


Operation

The vfilepg utility formats, dumps, and displays AdvFS file pages. A file page is the unit of disk storage for an AdvFS file: 8 Kbytes of contiguous disk space.

The utility has the following functions:

• Format and display one file page or all the file pages of a file. The file can be in a mounted or unmounted fileset.

• Save the contents of a file in one fileset to a file in another fileset. The file written is called a dump file. The source file can be in a mounted or unmounted fileset; the output fileset must be mounted.

• Format and display a dump file that has been dumped using the utility.

• Format and display a disk block of a file. A disk block is always 512 bytes and is located by specifying its logical block number.

You can specify which file page is to be displayed (page zero is the default), or you can display all the file pages in a file. The default display of file page information is in hexadecimal and ASCII formats. If you use the -f d flag, you can specify that the data be formatted as a directory page as it is displayed.

By default, the utility displays one 8-Kbyte file page. With the -a flag, it displays all the pages in the file; with the -b flag, it displays one 512-byte disk block.
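The relationship between 8-Kbyte file pages and 512-byte disk blocks is simple arithmetic, sketched below. The helper page_block_span is hypothetical and assumes the page lies in a single contiguous extent whose starting logical block number is already known (for example, from an extent map); it is not how the utility itself locates pages.

```python
PAGE_BYTES = 8 * 1024                          # one AdvFS file page
BLOCK_BYTES = 512                              # one disk block
BLOCKS_PER_PAGE = PAGE_BYTES // BLOCK_BYTES    # 16 blocks per page


def page_block_span(extent_start_lbn, page_in_extent):
    """Return (first_lbn, last_lbn) of a page, counting pages from zero
    within a contiguous extent that starts at extent_start_lbn."""
    first = extent_start_lbn + page_in_extent * BLOCKS_PER_PAGE
    return first, first + BLOCKS_PER_PAGE - 1
```

Under these assumptions, a page that starts at LBN 32 occupies blocks 32 through 47, which is why page-oriented utilities in this appendix work in 16-block units.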

It can be misleading to use this utility with the -r flag on a domain with mounted filesets. The utility does not synchronize its read requests with AdvFS file domain read and write requests. For example, the AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, metadata may not have been flushed in time for the utility to read it and consecutive reads of the same file page may return unpredictable or contradictory results. To avoid this problem, do not use the -r flag on an active domain.

The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the vfilepg utility fails in this situation, run it again.

-T fileset_tag Specifies the tag number for a fileset.

-t file_tag Specifies the tag number for a file.

volume Specifies the name of an AdvFS volume in an AdvFS file domain.

volume_index Specifies the index number of a volume in an AdvFS file domain.


vfragpg

Description

The vfragpg command displays a single header page of a frag file. In Tru64 UNIX Version 5.0, the nvfragpg command should be used in place of this command.

/sbin/advfs/vfragpg special page_LBN

special specifies the block special device name, such as /dev/rz2c.

page_LBN specifies the logical block number of the page.

Operation

The vfragpg command allows you to see the structure of a single header page in a frag file.

Use showfile -x /usr/.tags/1 to locate the logical block number of the page.

You must have root user privileges to access this command.

vlogpg

Description

The vlogpg utility translates a 16-block part of a volume of an unmounted file system and formats it as a log page. In Tru64 UNIX Version 5.0, the nvlogpg command should be used in place of this command.

/sbin/advfs/vlogpg special [page_LBN]

special specifies the volume on which the log is located.

page_LBN specifies the logical block number (LBN) of the volume; the default is zero.

Operation

Use this utility with other file utilities for debugging. If the volume is mounted, use the showfdmn -x command to get the extent map before calling the vlogpg command. If the volume is unmounted, use the vbmtchain command to identify the extent information that locates the log.

The vlogpg utility displays the pages and records needed to redo transactions that were in progress at the time of a crash.

You must have root user privileges to access this command.

vlsnpg

Description

The vlsnpg command displays the logical sequence number (LSN) of a page of the log. In Tru64 UNIX Version 5.0, the nvlogpg command should be used in place of this command.

/sbin/advfs/vlsnpg special [page_LBN]


special specifies the volume on which the page is located.

page_LBN specifies the logical block number (LBN) of the requested page; the default is zero.

Operation

Given the device and the LBN, the vlsnpg utility displays the logical sequence number of the page of the log. The page takes on the logical sequence number (LSN) of its first record. Use this command in a script to loop through logical sequence numbers for several pages to find the end of the log.
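The search for the end of the log can be sketched as a scan over per-page LSNs. This is a simplified model, not the documented procedure: it assumes the LSNs have already been collected into a list (one per log page, in page order), that the log is written circularly, and that LSNs only increase until the scan reaches the oldest surviving page. The function name is hypothetical and LSN wraparound is ignored.

```python
def last_written_page(page_lsns):
    """Return the index of the most recently written log page.

    page_lsns: list of LSNs, where index i holds the LSN of log page i.
    The newest page is the one after which the LSN drops; if the LSNs
    never drop, the last page in the list is the newest.
    """
    for i in range(len(page_lsns) - 1):
        if page_lsns[i + 1] < page_lsns[i]:
            return i
    return len(page_lsns) - 1
```

In a script, the list would be filled by running the command once per page, stepping page_LBN by the number of blocks in a log page, and extracting the LSN from each result.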

You must have root user privileges to access this command.

vrestore

Description

The vrestore (rvrestore) command restores files from savesets that are produced by the vdump and rvdump commands.

/sbin/vrestore -h
/sbin/vrestore -V
/sbin/vrestore -t [-f device]
/sbin/vrestore -l [-f device]
/sbin/vrestore -i [-mqv] [-f device] [-D path] [-o opt]
/sbin/vrestore -x [-mqv] [-f device] [-D path] [-o opt] [file ...]
/sbin/rvrestore -h
/sbin/rvrestore -V
/sbin/rvrestore -t [-f nodename:device]
/sbin/rvrestore -l [-f nodename:device]
/sbin/rvrestore -i [-mqv] [-f nodename:device] [-D path] [-o opt]
/sbin/rvrestore -x [-mqv] [-f nodename:device] [-D path] [-o opt] [file ...]

Options

-h Displays usage help for the command.

-V Displays the current version for the command.

-t Lists the names and sizes (in bytes) of all files contained in a saveset. Exception: the sizes of any AdvFS quota files are not shown.

-l Lists the entire saveset structure.

-i Permits interactive restoration of files read from a saveset. See the manpage for more information.

-x Extracts a specific file or files from the saveset. Use this flag as an alternative to using the add command in interactive mode. The -x flag can precede any other options, but the file ... list must be the last item on the command line.

-m Does not preserve the owner, group, or modes of each file from the device.

-q Prints only error messages; does not print information messages.


Operation

The vrestore and rvrestore commands restore data from a saveset previously archived by the vdump command or the rvdump command. The data, which can be restored from a file, a pipe, or a storage device (typically tape), is written to the specified directory. The default storage device from which files are read is /dev/tape/tape0_d1. You can use the -f flag to specify a different saveset. The vrestore and rvrestore commands restore any associated extended attributes, including ACLs, in the archive data. See the proplist(4) and acl(4) reference pages.

The vrestore and rvrestore commands are the restore facility for the AdvFS file system. However, the commands can be used to restore UFS and NFS files which have been archived by using the vdump or rvdump commands.

The default directory into which the files are restored is the current directory. You can specify an alternate directory by using the -D flag.

Use the -t flag to list the file names and sizes of the files in a saveset without restoring any files. When you are using the interactive shell and the AdvFS user and group quota files are available in the saveset for restoration, the file names used to refer to them will be quota.user and quota.group, regardless of what the quota files are named in either the backed up fileset or in the destination fileset. Restoration of the quota files does not change the names of the quota files in the destination fileset.

-v Writes the name of each file read from the storage device to the standard output device. Without this flag the vrestore command does not notify you about progress on reading from the storage device.

-f device | -f nodename:device When an argument follows the -f flag, it specifies the name of the storage device that contains the saveset to be restored. The argument replaces the default device /dev/tape/tape0_d1. For rvrestore, the specification must be in the format nodename:device to specify the remote machine name that holds the saveset to be restored.

-D path Specifies the destination path of where to restore the files. Without the -D flag, the files are restored to the current directory.

-o opt Specifies the action to take when a file already exists. The options are:

• yes, overwrite existing files without any query. This is the default.

• no, do not overwrite existing files.

• ask, ask whether to overwrite an existing file.

file... Specifies the file or files to restore when using the -x flag. All other flags must precede any file names on the command line.


If the destination fileset is AdvFS, and the saveset contains AdvFS fileset quotas, the fileset quotas are restored, even when they differ from the fileset quotas of the destination fileset. By using the -o no or -o ask options, you can prevent this behavior.

The vdump and rvdump commands can write many savesets to a tape. If you want to use the vrestore or the rvrestore commands to restore a particular saveset, you must first position the tape to the saveset by using the mt command with the fsf option. For example, to position a tape that is rewound at the beginning of its second saveset, you can enter the command mt fsf 1.

The vdump and vrestore commands maintain the sparseness of AdvFS sparse files. However, sparse files that have been striped are still handled in the fashion of releases earlier than DIGITAL UNIX Version 4.0D: they are allocated disk space and filled with zeros.

You do not have to be the root user to use the vrestore command or the rvrestore command, but you must have write access to the directory to which you want to restore the files.

See rsh(8) for server and client access rules when using the rvdump or rvrestore commands.

Filesets that have been archived by using the vdump or rvdump commands must be restored by using the vrestore or rvrestore commands. The vdump and rvdump commands are not interchangeable with the dump and rdump commands. Similarly, the vrestore and rvrestore commands are not interchangeable with the restore and rrestore commands.

Only the root user can restore AdvFS quota files and fileset quotas. A warning message is displayed when a non-root user attempts to use the vrestore command to restore AdvFS quota files or fileset quotas. The vrestore command in DIGITAL UNIX versions earlier than Version 4.0 cannot be used to restore savesets produced by the vdump command in DIGITAL UNIX Version 4.0 or higher systems.

AdvFS quota files can be restored to either an AdvFS fileset or a UFS file system, but UFS quota files cannot be restored to an AdvFS fileset. If AdvFS quota files are to be restored to a UFS file system, quotas must be enabled on the UFS file system. Otherwise, the operation fails.

AdvFS fileset quotas cannot be restored to a UFS file system because there is no UFS analog to AdvFS fileset quotas.

Attempting to do a vrestore or rvrestore to a base directory that has a default ACL or a default access ACL may cause unintended ACLs to be created on the restored files and directories. If ACLs are enabled on the system, check all ACLs after the vrestore or rvrestore.


vsbmpg

Description

The vsbmpg command displays a page from a storage bitmap (SBM) file. This command is new in Tru64 UNIX V5.0.

/sbin/advfs/vsbmpg [-v] sbm_id | domain_id
/sbin/advfs/vsbmpg sbm_id page [entry]
/sbin/advfs/vsbmpg sbm_id -a
/sbin/advfs/vsbmpg sbm_id -i index
/sbin/advfs/vsbmpg sbm_id -B block
/sbin/advfs/vsbmpg volume_id -b block
/sbin/advfs/vsbmpg volume_id -d dump_file

sbm_id specifies an SBM file using the following format:

volume_id | [-F] dump_file

The dump_file parameter is a previously-saved copy of the fileset’s SBM file. You can use the -F flag to force the utility to interpret the dump_file parameter as a file name if it has the same name as a domain name.

domain_id specifies an AdvFS file domain using the following format:

[-r] [-D] domain

By default, the utility opens all volumes using block device special files. Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to force the utility to interpret the name you supply in the domain argument as a domain name.

volume_id specifies an AdvFS volume using the following format:

[-V] volume | domain_id volume_index

Specify the -V flag to force the utility to interpret the name you supply in the volume argument as a volume name. The volume name argument also can be a full or partial path for the volume, for example /dev/disk/dsk12a or dsk12a. Specifying a partial path name always opens the character device special file.

Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.

page specifies the file page number of the SBM file.

entry specifies the index of the SBM word on the page.
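Because sbm_id, volume_id, and domain_id nest inside one another, it can help to see how the pieces compose into a command line. The following sketch only assembles argument lists from the syntax shown above and does not run the utility; the helper names, the domain name `domain_1`, and the dump file name `sbm.dump` are hypothetical.

```python
VSBMPG = "/sbin/advfs/vsbmpg"


def domain_id(domain, raw=False, force_domain=False):
    """Build a domain_id: [-r] [-D] domain"""
    return (["-r"] if raw else []) + (["-D"] if force_domain else []) + [domain]


def volume_by_name(volume, force_volume=False):
    """Build a volume_id of the form: [-V] volume"""
    return (["-V"] if force_volume else []) + [volume]


def volume_by_index(domain_args, volume_index):
    """Build a volume_id of the form: domain_id volume_index"""
    return list(domain_args) + [str(volume_index)]


def dump_file_id(dump_file, force_file=False):
    """Build an sbm_id of the form: [-F] dump_file"""
    return (["-F"] if force_file else []) + [dump_file]


def show_page(sbm_id, page, entry=None):
    """vsbmpg sbm_id page [entry]"""
    argv = [VSBMPG] + list(sbm_id) + [str(page)]
    if entry is not None:
        argv.append(str(entry))
    return argv


def show_block(sbm_id, block):
    """vsbmpg sbm_id -B block"""
    return [VSBMPG] + list(sbm_id) + ["-B", str(block)]


if __name__ == "__main__":
    # Volume 2 of a domain, opened through its raw device, SBM page 0.
    vol = volume_by_index(domain_id("domain_1", raw=True), 2)
    print(" ".join(show_page(vol, 0)))
    # The part of the SBM that maps block 4096 on a volume given by partial path.
    print(" ".join(show_block(volume_by_name("dsk12a"), 4096)))
```

Keeping each identifier as a list of tokens mirrors the syntax: an sbm_id is either a volume_id or a dump file, and a volume_id is either a volume name or a domain_id followed by a volume index.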


Operation

Storage bitmaps (SBMs) track the free and allocated disk space of AdvFS volumes. Each volume in an AdvFS domain has one SBM file. The vsbmpg utility displays pages of an SBM file.

Using the vsbmpg command you can:

• Display SBM page summaries

• Display an SBM file page

• Display one SBM entry

• Display corrupted volumes

• Save or display an SBM file

For more information, see the vsbmpg reference page.

An active domain, which is a domain with one or more of its filesets mounted, has all of its volumes opened using block device special files. These devices cannot be opened a second time without first being unmounted. However, the character device special files for the volumes can be opened more than once while still mounted.

It can be misleading to use this utility on a domain with mounted filesets because the utility does not synchronize its read requests with AdvFS file domain read and write requests.

For example, AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, when you run the utility, metadata may not have been flushed in time for the utility to read it, and consecutive reads of the same file page may return unpredictable or contradictory results. The domain is not harmed.

To avoid this problem, unmount all the filesets in the domain before using this utility.

Options

-a Displays all the pages of the SBM file.

-B block Displays the portion of the SBM that maps the specified block.

-b block Specifies a starting block for the part of an AdvFS volume that you want to format as an SBM page.

-d dump_file Specifies the name of a file that contains the output of this utility.

-i index Displays the SBM word specified by the index.

-v Checks the checksum on each page of the SBM.
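To make the page, entry, and -B block arguments more concrete, here is a toy model of a storage bitmap: one bit per group of blocks, bits packed into words, and words grouped into pages. It is a sketch of the general technique only. The three layout constants are assumptions chosen to keep the numbers small, not the real AdvFS on-disk values, and `ToySBM` is a hypothetical name.

```python
BLOCKS_PER_BIT = 16   # assumption: each bit covers 16 disk blocks
BITS_PER_WORD = 32    # assumption: 32-bit SBM words
WORDS_PER_PAGE = 4    # assumption: tiny pages so the example is easy to follow


class ToySBM:
    """Bitmap in which a set bit means the covered blocks are allocated."""

    def __init__(self, total_blocks):
        bits = -(-total_blocks // BLOCKS_PER_BIT)        # ceiling division
        self.words = [0] * (-(-bits // BITS_PER_WORD))

    @staticmethod
    def locate(block):
        """Return (page, entry, bit) of the SBM position that maps a block."""
        bit = block // BLOCKS_PER_BIT
        word = bit // BITS_PER_WORD
        return word // WORDS_PER_PAGE, word % WORDS_PER_PAGE, bit % BITS_PER_WORD

    def allocate(self, block, nblocks):
        """Mark every bit that covers blocks [block, block + nblocks)."""
        first = block // BLOCKS_PER_BIT
        last = -(-(block + nblocks) // BLOCKS_PER_BIT)
        for bit in range(first, last):
            self.words[bit // BITS_PER_WORD] |= 1 << (bit % BITS_PER_WORD)

    def is_allocated(self, block):
        bit = block // BLOCKS_PER_BIT
        return bool((self.words[bit // BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1)


if __name__ == "__main__":
    sbm = ToySBM(total_blocks=4096)
    sbm.allocate(block=0, nblocks=32)
    print(sbm.is_allocated(16), sbm.is_allocated(32))
    print(ToySBM.locate(2048))
```

In this model, `locate` plays the role of looking up the portion of the bitmap that maps a given block: it turns a block number into the page and word index where the corresponding bit lives.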


Even when no filesets in the domain are mounted, the utility can fail to open a block device if the AdvFS daemon, advfsd, is running, because the daemon briefly activates the domain as it runs. If the vsbmpg utility fails in this situation, run it again.

You must have root user privilege to use this command.
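The advice to rerun the utility when advfsd has briefly activated the domain can be wrapped in a small retry helper. This is a minimal sketch: `run_with_retry` is a hypothetical name, and it assumes the utility reports failure through a nonzero exit status, which this section does not state.

```python
import subprocess
import time


def run_with_retry(argv, attempts=3, delay=1.0, runner=subprocess.run):
    """Run argv, repeating after a short pause while it exits nonzero.

    Returns the result of the last attempt. The runner parameter exists
    so that the helper can be exercised without the real utility.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result = None
    for attempt in range(1, attempts + 1):
        result = runner(argv, capture_output=True, text=True)
        if result.returncode == 0:   # assumption: zero exit status means success
            break
        if attempt < attempts:
            time.sleep(delay)        # give the daemon time to release the domain
    return result
```

A call such as `run_with_retry(["/sbin/advfs/vsbmpg", "domain_1"])` would use the first syntax form shown earlier, where `domain_1` is a placeholder domain name. The caller should still inspect the returned exit status, because a persistent failure usually has another cause, such as a mounted fileset.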

vtagpg

Description

The vtagpg utility displays a formatted page of a tag file. In Tru64 UNIX Version 5.0, the nvtagpg command should be used in place of this command.

/sbin/advfs/vtagpg special [page_LBN]

special specifies the volume on which the tag file is located.

page_LBN specifies the logical block number (LBN) of the page to be examined; the default is zero.

Operation

The vtagpg utility formats a page of the disk as a tag file page. Use this utility with other file utilities to locate file-structure anomalies for debugging.

If the volume is mounted, use the showfile -x command to get the extent map before calling the vtagpg command. If the volume is unmounted, call the vbmtchain command to identify the extent information.

Run the vtagpg utility to obtain the root tag file first because it has entries for each fileset in the domain. Then, run the utility again to view the tag file for the fileset under investigation. This tag information points to the fileset metadata.

The vtagpg utility displays tag entries that map a file tag number to a primary mcell location.
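The two-step lookup described above (root tag file first, then the tag file of the fileset) can be pictured with a toy in-memory model. This is a sketch of the idea only: the dictionaries stand in for on-disk tag file pages, every tag number and location below is invented, and representing an mcell location as a (volume, page, cell) triple is an assumption of this example.

```python
from typing import NamedTuple


class McellAddr(NamedTuple):
    volume: int
    page: int
    cell: int


# Root tag file: one entry per fileset, pointing at that fileset's tag file.
root_tag_file = {
    1: McellAddr(1, 0, 5),   # hypothetical fileset with tag 1
    2: McellAddr(1, 0, 9),   # hypothetical fileset with tag 2
}

# Fileset tag files, keyed here by the location recorded in the root tag file.
# Each one maps a file tag number to the file's primary mcell location.
fileset_tag_files = {
    McellAddr(1, 0, 5): {6: McellAddr(1, 3, 12), 7: McellAddr(2, 1, 4)},
    McellAddr(1, 0, 9): {6: McellAddr(2, 8, 0)},
}


def primary_mcell(fileset_tag, file_tag):
    """Resolve a file to its primary mcell through both tag file levels."""
    tag_file_location = root_tag_file[fileset_tag]
    return fileset_tag_files[tag_file_location][file_tag]


if __name__ == "__main__":
    print(primary_mcell(fileset_tag=1, file_tag=7))
```

The same file tag number can appear in more than one fileset in this model, which is why the lookup has to start from the root tag file.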

You must be the root user to use this command.

