--- 1/draft-ch-18-pre-backchannel_ctl.txt 2008-04-30 21:07:15.983071900 -0700
+++ 2/draft-ietf-nfsv4-minorversion1-22.txt 2008-04-30 21:07:16.393691000 -0700
@@ -1,16 +1,16 @@

 NFSv4                                                         S. Shepler
 Internet-Draft                                                 M. Eisler
 Intended status: Standards Track                               D. Noveck
-Expires: October 19, 2008                                        Editors
-                                                          April 17, 2008
+Expires: November 1, 2008                                        Editors
+                                                          April 30, 2008

                      NFS Version 4 Minor Version 1
                  draft-ietf-nfsv4-minorversion1-22.txt

 Status of this Memo

    By submitting this Internet-Draft, each author represents that any
    applicable patent or other IPR claims of which he or she is aware
    have been or will be disclosed, and any of which he or she becomes
    aware will be disclosed, in accordance with Section 6 of BCP 79.
@@ -24,21 +24,21 @@
    and may be updated, replaced, or obsoleted by other documents at any
    time. It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt.

    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html.

-   This Internet-Draft will expire on October 19, 2008.
+   This Internet-Draft will expire on November 1, 2008.

 Copyright Notice

    Copyright (C) The IETF Trust (2008).

 Abstract

    This Internet-Draft describes NFS version 4 minor version one,
    including features retained from the base protocol and protocol
    extensions made subsequently. Major extensions introduced in NFS
@@ -268,193 +268,193 @@
        12.2.9. Layout Iomode . . . . . . . . . . . . . . . . . . . 269
        12.2.10. Device IDs . . . . . . . . . . . . . . . . . . . . . 270
      12.3. pNFS Operations . . . . . . . . . . . . . . . . . . . . 271
      12.4. pNFS Attributes . . . . . . . . . . . . . . . . . . . . 272
      12.5. Layout Semantics . . . . . . . . . . . . . . . . . . . . 272
        12.5.1. Guarantees Provided by Layouts . . . . . . . . . . .
272 12.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 273 12.5.3. Layout Stateid . . . . . . . . . . . . . . . . . . . 274 12.5.4. Committing a Layout . . . . . . . . . . . . . . . . 275 12.5.5. Recalling a Layout . . . . . . . . . . . . . . . . . 279 - 12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 286 - 12.5.7. Metadata Server Write Propagation . . . . . . . . . 286 + 12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 287 + 12.5.7. Metadata Server Write Propagation . . . . . . . . . 287 12.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 287 - 12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 288 - 12.7.1. Recovery from Client Restart . . . . . . . . . . . . 288 - 12.7.2. Dealing with Lease Expiration on the Client . . . . 289 + 12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 289 + 12.7.1. Recovery from Client Restart . . . . . . . . . . . . 289 + 12.7.2. Dealing with Lease Expiration on the Client . . . . 290 12.7.3. Dealing with Loss of Layout State on the Metadata - Server . . . . . . . . . . . . . . . . . . . . . . . 290 - 12.7.4. Recovery from Metadata Server Restart . . . . . . . 290 - 12.7.5. Operations During Metadata Server Grace Period . . . 292 - 12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 293 - 12.8. Metadata and Storage Device Roles . . . . . . . . . . . 293 + Server . . . . . . . . . . . . . . . . . . . . . . . 291 + 12.7.4. Recovery from Metadata Server Restart . . . . . . . 291 + 12.7.5. Operations During Metadata Server Grace Period . . . 293 + 12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 294 + 12.8. Metadata and Storage Device Roles . . . . . . . . . . . 294 12.9. Security Considerations for pNFS . . . . . . . . . . . . 294 13. PNFS: NFSv4.1 File Layout Type . . . . . . . . . . . . . . . 295 - 13.1. Client ID and Session Considerations . . . . . . . . . . 295 - 13.1.1. Sessions Considerations for Data Servers . . . . . . 297 + 13.1. 
Client ID and Session Considerations . . . . . . . . . . 296 + 13.1.1. Sessions Considerations for Data Servers . . . . . . 298 13.2. File Layout Definitions . . . . . . . . . . . . . . . . 298 - 13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 298 - 13.4. Interpreting the File Layout . . . . . . . . . . . . . . 302 - 13.4.1. Determining the Stripe Unit Number . . . . . . . . . 302 - 13.4.2. Interpreting the File Layout Using Sparse Packing . 302 - 13.4.3. Interpreting the File Layout Using Dense Packing . . 305 - 13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 307 - 13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 309 - 13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 310 - 13.7. COMMIT Through Metadata Server . . . . . . . . . . . . . 312 - 13.8. The Layout Iomode . . . . . . . . . . . . . . . . . . . 314 - 13.9. Metadata and Data Server State Coordination . . . . . . 314 - 13.9.1. Global Stateid Requirements . . . . . . . . . . . . 314 - 13.9.2. Data Server State Propagation . . . . . . . . . . . 315 - 13.10. Data Server Component File Size . . . . . . . . . . . . 317 - 13.11. Layout Revocation and Fencing . . . . . . . . . . . . . 318 - 13.12. Security Considerations for the File Layout Type . . . . 318 - 14. Internationalization . . . . . . . . . . . . . . . . . . . . 319 - 14.1. Stringprep profile for the utf8str_cs type . . . . . . . 320 - 14.2. Stringprep profile for the utf8str_cis type . . . . . . 322 - 14.3. Stringprep profile for the utf8str_mixed type . . . . . 323 - 14.4. UTF-8 Capabilities . . . . . . . . . . . . . . . . . . . 325 - 14.5. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 325 - 15. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 326 - 15.1. Error Definitions . . . . . . . . . . . . . . . . . . . 326 - 15.1.1. General Errors . . . . . . . . . . . . . . . . . . . 328 - 15.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 330 - 15.1.3. 
Compound Structure Errors . . . . . . . . . . . . . 331 - 15.1.4. File System Errors . . . . . . . . . . . . . . . . . 333 - 15.1.5. State Management Errors . . . . . . . . . . . . . . 335 - 15.1.6. Security Errors . . . . . . . . . . . . . . . . . . 336 - 15.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 336 - 15.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 337 - 15.1.9. Reclaim Errors . . . . . . . . . . . . . . . . . . . 338 - 15.1.10. pNFS Errors . . . . . . . . . . . . . . . . . . . . 339 - 15.1.11. Session Use Errors . . . . . . . . . . . . . . . . . 340 - 15.1.12. Session Management Errors . . . . . . . . . . . . . 342 - 15.1.13. Client Management Errors . . . . . . . . . . . . . . 342 - 15.1.14. Delegation Errors . . . . . . . . . . . . . . . . . 343 - 15.1.15. Attribute Handling Errors . . . . . . . . . . . . . 343 - 15.1.16. Obsoleted Errors . . . . . . . . . . . . . . . . . . 344 - 15.2. Operations and their valid errors . . . . . . . . . . . 345 - 15.3. Callback operations and their valid errors . . . . . . . 361 - 15.4. Errors and the operations that use them . . . . . . . . 363 - 16. NFSv4.1 Procedures . . . . . . . . . . . . . . . . . . . . . 377 - 16.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 377 - 16.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 378 - 17. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 389 - 18. NFSv4.1 Operations . . . . . . . . . . . . . . . . . . . . . 392 - 18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 392 - 18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 398 - 18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 399 - 18.4. Operation 6: CREATE - Create a Non-Regular File Object . 402 + 13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 299 + 13.4. Interpreting the File Layout . . . . . . . . . . . . . . 303 + 13.4.1. Determining the Stripe Unit Number . . . . . . . . . 303 + 13.4.2. 
Interpreting the File Layout Using Sparse Packing . 303 + 13.4.3. Interpreting the File Layout Using Dense Packing . . 306 + 13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 308 + 13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 310 + 13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 311 + 13.7. COMMIT Through Metadata Server . . . . . . . . . . . . . 313 + 13.8. The Layout Iomode . . . . . . . . . . . . . . . . . . . 315 + 13.9. Metadata and Data Server State Coordination . . . . . . 315 + 13.9.1. Global Stateid Requirements . . . . . . . . . . . . 315 + 13.9.2. Data Server State Propagation . . . . . . . . . . . 316 + 13.10. Data Server Component File Size . . . . . . . . . . . . 318 + 13.11. Layout Revocation and Fencing . . . . . . . . . . . . . 319 + 13.12. Security Considerations for the File Layout Type . . . . 319 + 14. Internationalization . . . . . . . . . . . . . . . . . . . . 320 + 14.1. Stringprep profile for the utf8str_cs type . . . . . . . 321 + 14.2. Stringprep profile for the utf8str_cis type . . . . . . 323 + 14.3. Stringprep profile for the utf8str_mixed type . . . . . 324 + 14.4. UTF-8 Capabilities . . . . . . . . . . . . . . . . . . . 326 + 14.5. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 326 + 15. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 327 + 15.1. Error Definitions . . . . . . . . . . . . . . . . . . . 327 + 15.1.1. General Errors . . . . . . . . . . . . . . . . . . . 329 + 15.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 331 + 15.1.3. Compound Structure Errors . . . . . . . . . . . . . 332 + 15.1.4. File System Errors . . . . . . . . . . . . . . . . . 334 + 15.1.5. State Management Errors . . . . . . . . . . . . . . 336 + 15.1.6. Security Errors . . . . . . . . . . . . . . . . . . 337 + 15.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 337 + 15.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 338 + 15.1.9. Reclaim Errors . . . . . . . . . 
. . . . . . . . . . 339 + 15.1.10. pNFS Errors . . . . . . . . . . . . . . . . . . . . 340 + 15.1.11. Session Use Errors . . . . . . . . . . . . . . . . . 341 + 15.1.12. Session Management Errors . . . . . . . . . . . . . 343 + 15.1.13. Client Management Errors . . . . . . . . . . . . . . 343 + 15.1.14. Delegation Errors . . . . . . . . . . . . . . . . . 344 + 15.1.15. Attribute Handling Errors . . . . . . . . . . . . . 344 + 15.1.16. Obsoleted Errors . . . . . . . . . . . . . . . . . . 345 + 15.2. Operations and their valid errors . . . . . . . . . . . 346 + 15.3. Callback operations and their valid errors . . . . . . . 362 + 15.4. Errors and the operations that use them . . . . . . . . 364 + 16. NFSv4.1 Procedures . . . . . . . . . . . . . . . . . . . . . 378 + 16.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 378 + 16.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 379 + 17. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 390 + 18. NFSv4.1 Operations . . . . . . . . . . . . . . . . . . . . . 393 + 18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 393 + 18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 399 + 18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 400 + 18.4. Operation 6: CREATE - Create a Non-Regular File Object . 403 18.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting - Recovery . . . . . . . . . . . . . . . . . . . . . . . . 405 - 18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 406 - 18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 406 - 18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 408 - 18.9. Operation 11: LINK - Create Link to a File . . . . . . . 409 - 18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 412 - 18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 416 - 18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 417 - 18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 
419 - 18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 420 + Recovery . . . . . . . . . . . . . . . . . . . . . . . . 406 + 18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 407 + 18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 407 + 18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 409 + 18.9. Operation 11: LINK - Create Link to a File . . . . . . . 410 + 18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 413 + 18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 417 + 18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 418 + 18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 420 + 18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 421 18.15. Operation 17: NVERIFY - Verify Difference in - Attributes . . . . . . . . . . . . . . . . . . . . . . . 422 - 18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 423 + Attributes . . . . . . . . . . . . . . . . . . . . . . . 423 + 18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 424 18.17. Operation 19: OPENATTR - Open Named Attribute - Directory . . . . . . . . . . . . . . . . . . . . . . . 442 - 18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 443 - 18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 445 - 18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 445 - 18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 447 - 18.22. Operation 25: READ - Read from File . . . . . . . . . . 448 - 18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 450 - 18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 454 - 18.25. Operation 28: REMOVE - Remove File System Object . . . . 455 - 18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 457 - 18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 461 - 18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 462 - 18.29. 
Operation 33: SECINFO - Obtain Available Security . . . 463 - 18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 467 - 18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 470 - 18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 471 - 18.33. Operation 40: BACKCHANNEL_CTL - Backchannel control . . 475 - 18.34. Operation 41: BIND_CONN_TO_SESSION . . . . . . . . . . . 477 - 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 479 + Directory . . . . . . . . . . . . . . . . . . . . . . . 443 + 18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 444 + 18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 446 + 18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 446 + 18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 448 + 18.22. Operation 25: READ - Read from File . . . . . . . . . . 449 + 18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 451 + 18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 455 + 18.25. Operation 28: REMOVE - Remove File System Object . . . . 456 + 18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 458 + 18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 462 + 18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 463 + 18.29. Operation 33: SECINFO - Obtain Available Security . . . 464 + 18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 468 + 18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 471 + 18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 472 + 18.33. Operation 40: BACKCHANNEL_CTL - Backchannel control . . 476 + 18.34. Operation 41: BIND_CONN_TO_SESSION . . . . . . . . . . . 478 + 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 481 18.36. Operation 43: CREATE_SESSION - Create New Session and - Confirm Client ID . . . . . . . . . . . . . . . . . . . 495 + Confirm Client ID . . . . . . . . . . . . . . . . . . . 498 18.37. 
Operation 44: DESTROY_SESSION - Destroy existing - session . . . . . . . . . . . . . . . . . . . . . . . . 505 + session . . . . . . . . . . . . . . . . . . . . . . . . 508 18.38. Operation 45: FREE_STATEID - Free stateid with no - locks . . . . . . . . . . . . . . . . . . . . . . . . . 507 + locks . . . . . . . . . . . . . . . . . . . . . . . . . 509 18.39. Operation 46: GET_DIR_DELEGATION - Get a directory - delegation . . . . . . . . . . . . . . . . . . . . . . . 508 - 18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 512 + delegation . . . . . . . . . . . . . . . . . . . . . . . 510 + 18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 514 18.41. Operation 48: GETDEVICELIST - Get All Device Mappings - for a File System . . . . . . . . . . . . . . . . . . . 514 + for a File System . . . . . . . . . . . . . . . . . . . 516 18.42. Operation 49: LAYOUTCOMMIT - Commit writes made using - a layout . . . . . . . . . . . . . . . . . . . . . . . . 516 - 18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 519 + a layout . . . . . . . . . . . . . . . . . . . . . . . . 518 + 18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 521 18.44. Operation 51: LAYOUTRETURN - Release Layout - Information . . . . . . . . . . . . . . . . . . . . . . 523 + Information . . . . . . . . . . . . . . . . . . . . . . 526 18.45. Operation 52: SECINFO_NO_NAME - Get Security on - Unnamed Object . . . . . . . . . . . . . . . . . . . . . 528 + Unnamed Object . . . . . . . . . . . . . . . . . . . . . 530 18.46. Operation 53: SEQUENCE - Supply per-procedure - sequencing and control . . . . . . . . . . . . . . . . . 529 - 18.47. Operation 54: SET_SSV - Update SSV for a Client ID . . . 535 + sequencing and control . . . . . . . . . . . . . . . . . 531 + 18.47. Operation 54: SET_SSV - Update SSV for a Client ID . . . 537 18.48. Operation 55: TEST_STATEID - Test stateids for - validity . . . . . . . . . . . . . . . . . . . . . . . . 537 - 18.49. 
Operation 56: WANT_DELEGATION - Request Delegation . . . 539 + validity . . . . . . . . . . . . . . . . . . . . . . . . 539 + 18.49. Operation 56: WANT_DELEGATION - Request Delegation . . . 541 18.50. Operation 57: DESTROY_CLIENTID - Destroy existing - client ID . . . . . . . . . . . . . . . . . . . . . . . 542 + client ID . . . . . . . . . . . . . . . . . . . . . . . 544 18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims - Finished . . . . . . . . . . . . . . . . . . . . . . . . 542 - 18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 545 - 19. NFSv4.1 Callback Procedures . . . . . . . . . . . . . . . . . 545 - 19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 546 - 19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 546 - 20. NFSv4.1 Callback Operations . . . . . . . . . . . . . . . . . 550 - 20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 550 - 20.2. Operation 4: CB_RECALL - Recall an Open Delegation . . . 551 + Finished . . . . . . . . . . . . . . . . . . . . . . . . 545 + 18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 547 + 19. NFSv4.1 Callback Procedures . . . . . . . . . . . . . . . . . 547 + 19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 548 + 19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 548 + 20. NFSv4.1 Callback Operations . . . . . . . . . . . . . . . . . 552 + 20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 552 + 20.2. Operation 4: CB_RECALL - Recall an Open Delegation . . . 553 20.3. Operation 5: CB_LAYOUTRECALL - Recall Layout from - Client . . . . . . . . . . . . . . . . . . . . . . . . . 552 - 20.4. Operation 6: CB_NOTIFY - Notify directory changes . . . 556 + Client . . . . . . . . . . . . . . . . . . . . . . . . . 554 + 20.4. Operation 6: CB_NOTIFY - Notify directory changes . . . 558 20.5. Operation 7: CB_PUSH_DELEG - Offer Delegation to - Client . . . . . . . . . . . . . . . . . . . . . . . . . 560 - 20.6. 
Operation 8: CB_RECALL_ANY - Keep any N delegations . . 561 + Client . . . . . . . . . . . . . . . . . . . . . . . . . 562 + 20.6. Operation 8: CB_RECALL_ANY - Keep any N delegations . . 563 20.7. Operation 9: CB_RECALLABLE_OBJ_AVAIL - Signal - Resources for Recallable Objects . . . . . . . . . . . . 563 + Resources for Recallable Objects . . . . . . . . . . . . 565 20.8. Operation 10: CB_RECALL_SLOT - change flow control - limits . . . . . . . . . . . . . . . . . . . . . . . . . 564 + limits . . . . . . . . . . . . . . . . . . . . . . . . . 566 20.9. Operation 11: CB_SEQUENCE - Supply backchannel - sequencing and control . . . . . . . . . . . . . . . . . 565 + sequencing and control . . . . . . . . . . . . . . . . . 567 20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending - Delegation Wants . . . . . . . . . . . . . . . . . . . . 567 + Delegation Wants . . . . . . . . . . . . . . . . . . . . 569 20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible - lock availability . . . . . . . . . . . . . . . . . . . 568 + lock availability . . . . . . . . . . . . . . . . . . . 570 20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify device ID - changes . . . . . . . . . . . . . . . . . . . . . . . . 570 + changes . . . . . . . . . . . . . . . . . . . . . . . . 572 20.13. Operation 10044: CB_ILLEGAL - Illegal Callback - Operation . . . . . . . . . . . . . . . . . . . . . . . 572 - 21. Security Considerations . . . . . . . . . . . . . . . . . . . 572 - 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 574 - 22.1. Named Attribute Definitions . . . . . . . . . . . . . . 574 - 22.2. ONC RPC Network Identifiers (netids) . . . . . . . . . . 574 - 22.3. Defining New Notifications . . . . . . . . . . . . . . . 575 - 22.4. Defining New Layout Types . . . . . . . . . . . . . . . 575 - 22.5. Path Variable Definitions . . . . . . . . . . . . . . . 577 - 22.5.1. Path Variable Values . . . . . . . . . . . . . . . . 577 - 22.5.2. Path Variable Names . . . . . . . . . . . 
. . . . . 577 - 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 577 - 23.1. Normative References . . . . . . . . . . . . . . . . . . 577 - 23.2. Informative References . . . . . . . . . . . . . . . . . 579 - Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 581 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 583 - Intellectual Property and Copyright Statements . . . . . . . . . 584 + Operation . . . . . . . . . . . . . . . . . . . . . . . 574 + 21. Security Considerations . . . . . . . . . . . . . . . . . . . 574 + 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 576 + 22.1. Named Attribute Definitions . . . . . . . . . . . . . . 576 + 22.2. ONC RPC Network Identifiers (netids) . . . . . . . . . . 576 + 22.3. Defining New Notifications . . . . . . . . . . . . . . . 577 + 22.4. Defining New Layout Types . . . . . . . . . . . . . . . 577 + 22.5. Path Variable Definitions . . . . . . . . . . . . . . . 579 + 22.5.1. Path Variable Values . . . . . . . . . . . . . . . . 579 + 22.5.2. Path Variable Names . . . . . . . . . . . . . . . . 579 + 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 579 + 23.1. Normative References . . . . . . . . . . . . . . . . . . 579 + 23.2. Informative References . . . . . . . . . . . . . . . . . 581 + Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 583 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 585 + Intellectual Property and Copyright Statements . . . . . . . . . 586 1. Introduction 1.1. The NFS Version 4 Minor Version 1 Protocol The NFS version 4 minor version 1 (NFSv4.1) protocol is the second minor version of the NFS version 4 (NFSv4) protocol. The first minor version, NFSv4.0 is described in [21]. It generally follows the guidelines for minor versioning model listed in Section 10 of RFC 3530. 
    However, it diverges from guidelines 11 ("a client and server
@@ -13410,39 +13410,71 @@
        CB_LAYOUTRECALL, so it returns NFS4ERR_RECALLCONFLICT.

    3. The client sent the LAYOUTGET after processing the
        CB_LAYOUTRECALL, the server received the CB_LAYOUTRECALL
        response, but the LAYOUTGET arrived before the LAYOUTRETURN that
        completed that processing. The "seqid" in the layout stateid of
        LAYOUTGET is equal to that of the "seqid" in CB_LAYOUTRECALL.
        The server has received a response to the CB_LAYOUTRECALL, so it
        returns NFS4ERR_RETURNCONFLICT.

-12.5.5.2.1.4. Wraparound of sequence id
+12.5.5.2.1.4. Wraparound and Validation of Seqid

    The rules for layout stateid processing differ from other stateids in
    the protocol because the "seqid" value can not be zero and the
    stateid's "seqid" value changes in a CB_LAYOUTRECALL operation. The
    non-zero requirement combined with the inherent parallelism of layout
    operations means that a set of LAYOUTGET and LAYOUTRETURN operations
-   may contain the same value for "seqid" and the value will represent
-   the span of parallelism achieved by the client. To account for this
-   parallelism, the server validates that the "seqid" is non-zero. If
-   the server uses a CB_LAYOUTRECALL, then the "seqid" is validated
-   further by applying the rules listed above in Section 12.5.5.2.1.3.
+   may contain the same value for "seqid". The server uses a slightly
+   modified version of the modulo arithmetic as described in
+   Section 2.10.5.1 when incrementing the layout stateid's "seqid". The
+   modification to that modulo arithmetic description is to not use
+   zero. The modulo arithmetic is also used for the comparisons of
+   "seqid" values in the processing of CB_LAYOUTRECALL events as
+   described above in Section 12.5.5.2.1.3.

-   The server uses a slightly modified version of the modulo arithmetic
-   as described in Section 2.10.5.1 when incrementing the layout
-   stateid's "seqid". The modification to that modulo arithmetic
-   description is to not use zero. The modulo arithmetic is also used
-   for the comparisons of "seqid" values in the processing of
-   CB_LAYOUTRECALL events as described above in Section 12.5.5.2.1.3.
+   Just as the server validates the "seqid" in the event of
+   CB_LAYOUTRECALL usage, as described in Section 12.5.5.2.1.3, the
+   server also validates the "seqid" value to ensure that it is within
+   an appropriate range. This range represents the degree of
+   parallelism the server supports for layout stateids. If the client
+   is sending multiple layout operations to the server in parallel, by
+   definition, the "seqid" value in the supplied stateid will not be the
+   current "seqid" as held by the server. The range of parallelism
+   spans from the highest or current "seqid" to a "seqid" value in the
+   past. To assist in the discussion, the server's current "seqid"
+   value for a layout stateid is defined as: SERVER_CURRENT_SEQID. The
+   lowest "seqid" value that is acceptable to the server is represented
+   by PAST_SEQID. And the value for the range of valid "seqid"s or
+   range of parallelism is VALID_SEQID_RANGE. Therefore, the following
+   holds: VALID_SEQID_RANGE = SERVER_CURRENT_SEQID - PAST_SEQID. In the
+   following, all arithmetic is the modulo arithmetic as described
+   above.
+
+   The server MUST support a minimum VALID_SEQID_RANGE. The minimum is
+   defined as: VALID_SEQID_RANGE = summation of 1..N of
+   (ca_maxoperations(i) - 1) where N is the number of session fore
+   channels and ca_maxoperations(i) is the value of the ca_maxoperations
+   returned from CREATE_SESSION of the i'th session. The reason for
+   minus 1 is to allow for the required SEQUENCE operation. The server
+   MAY support a VALID_SEQID_RANGE value larger than the minimum. The
+   maximum VALID_SEQID_RANGE is (2 ^ 32 - 2) (accounts for 0 not being a
+   valid "seqid" value).
+
+   If the server finds the "seqid" is zero, the NFS4ERR_BAD_STATEID
+   error is returned to the client. The server further validates the
+   "seqid" to ensure it is within the range of parallelism,
+   VALID_SEQID_RANGE. If the "seqid" value is outside of that range,
+   the error NFS4ERR_OLD_STATEID is returned to the client. Upon
+   receipt of NFS4ERR_OLD_STATEID, the client updates the stateid in the
+   layout request based on processing of other layout requests and re-
+   sends the operation to the server.

 12.5.5.2.1.5. Bulk Recall and Return

    PNFS supports recalling and returning all layouts that are for files
    belonging to a particular fsid (LAYOUTRECALL4_FSID,
    LAYOUTRETURN4_FSID) or client ID (LAYOUTRECALL4_ALL,
    LAYOUTRETURN4_ALL). There are no "bulk" stateids, so detection of
    races via the seqid is not possible. The server MUST NOT initiate
    bulk recall while another recall is in progress, or the corresponding
    LAYOUTRETURN is in progress or pending. In the event the server
@@ -16273,21 +16305,22 @@
    |                      | NFS4ERR_FHEXPIRED, NFS4ERR_INVAL,         |
    |                      | NFS4ERR_IO, NFS4ERR_MOVED,                |
    |                      | NFS4ERR_NOFILEHANDLE,                     |
    |                      | NFS4ERR_OP_NOT_IN_SESSION,                |
    |                      | NFS4ERR_REP_TOO_BIG,                      |
    |                      | NFS4ERR_REP_TOO_BIG_TO_CACHE,             |
    |                      | NFS4ERR_REQ_TOO_BIG, NFS4ERR_SERVERFAULT, |
    |                      | NFS4ERR_STALE, NFS4ERR_TOO_MANY_OPS       |
    | BACKCHANNEL_CTL      | NFS4ERR_BADXDR, NFS4ERR_DEADSESSION,      |
    |                      | NFS4ERR_DELAY, NFS4ERR_INVAL,             |
-   |                      | NFS4ERR_NOENT, NFS4ERR_REP_TOO_BIG,       |
+   |                      | NFS4ERR_NOENT, NFS4ERR_OP_NOT_IN_SESSION, |
+   |                      | NFS4ERR_REP_TOO_BIG,                      |
    |                      | NFS4ERR_REP_TOO_BIG_TO_CACHE,             |
    |                      | NFS4ERR_REQ_TOO_BIG, NFS4ERR_TOO_MANY_OPS |
    | BIND_CONN_TO_SESSION | NFS4ERR_BADSESSION, NFS4ERR_BADXDR,       |
    |                      | NFS4ERR_BAD_SESSION_DIGEST,               |
    |                      | NFS4ERR_DEADSESSION, NFS4ERR_DELAY,       |
    |                      | NFS4ERR_INVAL, NFS4ERR_NOT_ONLY_OP,       |
    |                      | NFS4ERR_REP_TOO_BIG,                      |
    |                      | NFS4ERR_REP_TOO_BIG_TO_CACHE,             |
    |                      | NFS4ERR_REQ_TOO_BIG, NFS4ERR_SERVERFAULT, |
    |                      | NFS4ERR_TOO_MANY_OPS                      |
@@ -17303,23 +17336,23 @@
    | NFS4ERR_NO_GRACE          | LAYOUTCOMMIT, LAYOUTRETURN,  |
    |                           | LOCK, OPEN, WANT_DELEGATION  |
    | NFS4ERR_OLD_STATEID       | CLOSE, DELEGRETURN,          |
    |                           | FREE_STATEID, LAYOUTGET,     |
    |                           | LAYOUTRETURN, LOCK, LOCKU,   |
    |                           | OPEN, OPEN_DOWNGRADE, READ,  |
    |                           | SETATTR, WRITE               |
    | NFS4ERR_OPENMODE          | LAYOUTGET, LOCK, READ,       |
    |                           | SETATTR, WRITE               |
    | NFS4ERR_OP_ILLEGAL        | CB_ILLEGAL, ILLEGAL          |
-   | NFS4ERR_OP_NOT_IN_SESSION | ACCESS, CB_GETATTR,          |
-   |                           | CB_LAYOUTRECALL, CB_NOTIFY,  |
-   |                           | CB_NOTIFY_LOCK,              |
+   | NFS4ERR_OP_NOT_IN_SESSION | ACCESS, BACKCHANNEL_CTL,     |
+   |                           | CB_GETATTR, CB_LAYOUTRECALL, |
+   |                           | CB_NOTIFY, CB_NOTIFY_LOCK,   |
    |                           | CB_PUSH_DELEG, CB_RECALL,    |
    |                           | CB_RECALLABLE_OBJ_AVAIL,     |
    |                           | CB_RECALL_ANY,               |
    |                           | CB_RECALL_SLOT,              |
    |                           | CB_WANTS_CANCELLED, CLOSE,   |
    |                           | COMMIT, CREATE, DELEGPURGE,  |
    |                           | DELEGRETURN, FREE_STATEID,   |
    |                           | GETATTR, GETDEVICEINFO,      |
    |                           | GETDEVICELIST, GETFH,        |
    |                           | GET_DIR_DELEGATION,          |
@@ -21936,22 +21969,22 @@
    number of bytes written starting at location, offset, is returned.
    The server also returns an indication of the level of commitment of
    the data and metadata via committed. Per Table 20,

    o The server MAY commit the data at a stronger level than requested.

    o The server MUST commit the data at a level at least as high as
       that committed.

-   Valid combinations of the stable in the request and committed in the
-   reply.
+   Valid combinations of the fields stable in the request and committed
+   in the reply.

    +------------+-----------------------------------+
    | stable     | committed                         |
    +------------+-----------------------------------+
    | UNSTABLE4  | FILE_SYNC4, DATA_SYNC4, UNSTABLE4 |
    | DATA_SYNC4 | FILE_SYNC4, DATA_SYNC4            |
    | FILE_SYNC4 | FILE_SYNC4                        |
    +------------+-----------------------------------+

                                 Table 20

@@ -22037,21 +22070,21 @@
    Some implementations may return NFS4ERR_NOSPC instead of
    NFS4ERR_DQUOT when a user's quota is exceeded.

    In the case that the current filehandle is of type NF4DIR, the server
    will return NFS4ERR_ISDIR. If the current file is a symbolic link,
    the error NFS4ERR_SYMLINK will be returned. Otherwise, if the
    current filehandle does not designate an ordinary file, the server
    will return NFS4ERR_WRONG_TYPE.

-   If mandatory file locking is effect for the file, and the
+   If mandatory file locking is in effect for the file, and the
    corresponding byte-range of the data to be written to the file is
    read or write locked by an owner that is not associated with the
    stateid, the server MUST return NFS4ERR_LOCKED. If so, the client
    MUST check if the owner corresponding to the stateid used with the
    WRITE operation has a conflicting read lock that overlaps with the
    region that was to be written. If the stateid's owner has no
    conflicting read lock, then the client SHOULD try to get the
    appropriate write byte-range lock via the LOCK operation before re-
    attempting the WRITE. When the WRITE completes, the client SHOULD
    release the byte-range lock via LOCKU.

@@ -22113,29 +22146,29 @@
            nfsstat4 bcr_status;
    };

 18.33.3. DESCRIPTION

    The BACKCHANNEL_CTL operation replaces the backchannel's callback
    program number and adds (not replaces) RPCSEC_GSS contexts for use by
    the backchannel.

    The arguments of the BACKCHANNEL_CTL call are a subset of the
-   CREATE_SESSION parameters. In the arguments to BACKCHANNEL_CTL, the
+   CREATE_SESSION parameters. In the arguments of BACKCHANNEL_CTL, the
    bca_cb_program field and bca_sec_parms fields correspond respectively
-   to the csa_cb_program and csa_sec_parms of the arguments to
+   to the csa_cb_program and csa_sec_parms fields of the arguments of
    CREATE_SESSION (Section 18.36).

    BACKCHANNEL_CTL MUST appear in a COMPOUND that starts with SEQUENCE.

    If the RPCSEC_GSS handle identified by gcbp_handle_from_server does
-   not exist on the server, the server will return NFS4ERR_NOENT.
+   not exist on the server, the server MUST return NFS4ERR_NOENT.

 18.34. Operation 41: BIND_CONN_TO_SESSION

 18.34.1. ARGUMENT

    enum channel_dir_from_client4 {
            CDFC4_FORE = 0x1,
            CDFC4_BACK = 0x2,
            CDFC4_FORE_OR_BOTH = 0x3,
            CDFC4_BACK_OR_BOTH = 0x7
@@ -22176,87 +22209,105 @@
    default: void;
    };

 18.34.3. DESCRIPTION

    BIND_CONN_TO_SESSION is used to associate additional connections with
    a session. It MUST be used on the connection being associated with
    the session. It MUST be the only operation in the COMPOUND
    procedure. If SP4_NONE (Section 18.35) state protection is used, any
-   principal, security flavor, or RPCSEC_GSS context can invoke the
-   operation. If SP4_MACH_CRED is used, RPCSEC_GSS must be used with
-   the integrity or privacy services, using the principal that created
-   the client ID. If SP4_SSV is used, RPCSEC_GSS with the SSV GSS
-   mechanism (Section 2.10.8) and integrity or privacy MUST be used.
+   principal, security flavor, or RPCSEC_GSS context MAY be used to
+   invoke the operation. If SP4_MACH_CRED is used, RPCSEC_GSS MUST be
+   used with the integrity or privacy services, using the principal that
+   created the client ID. If SP4_SSV is used, RPCSEC_GSS with the SSV
+   GSS mechanism (Section 2.10.8) and integrity or privacy MUST be used.

-   If when the client ID was created, the client opted for SP4_NONE
+   If, when the client ID was created, the client opted for SP4_NONE
    state protection, the client is not required to use
    BIND_CONN_TO_SESSION to associate the connection with the session,
    unless the client wishes to associate the connection with the
    backchannel. When SP4_NONE protection is used, simply sending a
-   COMPOUND with a SEQUENCE operation that is sufficient to associate
+   COMPOUND request with a SEQUENCE operation is sufficient to associate
    the connnection with the session specified in SEQUENCE.

    The field bctsa_dir indicates whether the client wants to associate
    the connection with the fore channel or the backchannel or both
    channels. The value CDFC4_FORE_OR_BOTH indicates the client wants to
-   associate with both the fore channel and backchannel, but will accept
-   the connection being associated to just the fore channel. The value
-   CDFC4_BACK_OR_BOTH indicates the client wants to associate with both
-   the fore and backchannel, but will accept the connection being
-   associated with the backchannel. The server replies in bctsr_dir
-   which channel(s) the connection is associated with. If the client
-   specified CDFC4_FORE, the server MUST return CDFS4_FORE. If the
-   client specified CDFC4_BACK, the server MUST return CDFS4_BACK. If
-   the client specified CDFC4_FORE_OR_BOTH, the server MUST return
-   CDFS4_FORE or CDFS4_BOTH. If the client specified
+   associate the connection with both the fore channel and backchannel,
+   but will accept the connection being associated to just the fore
+   channel. The value CDFC4_BACK_OR_BOTH indicates the client wants to
+   associate with both the fore and backchannel, but will accept the
+   connection being associated with just the backchannel. The server
+   replies in bctsr_dir which channel(s) the connection is associated
+   with. If the client specified CDFC4_FORE, the server MUST return
+   CDFS4_FORE. If the client specified CDFC4_BACK, the server MUST
+   return CDFS4_BACK. If the client specified CDFC4_FORE_OR_BOTH, the
+   server MUST return CDFS4_FORE or CDFS4_BOTH. If the client specified
    CDFC4_BACK_OR_BOTH, the server MUST return CDFS4_BACK or CDFS4_BOTH.

    See the CREATE_SESSION operation (Section 18.36), and the description
    of the argument csa_use_conn_in_rdma_mode to understand
    bctsa_use_conn_in_rdma_mode, and the description of
    csr_use_conn_in_rdma_mode to understand bctsr_use_conn_in_rdma_mode.

    Invoking BIND_CONN_TO_SESSION on a connection already associated with
-   the specified session has no effect, and the server SHOULD respond
-   with NFS4_OK.
+   the specified session has no effect, and the server MUST respond with
+   NFS4_OK, unless the client is demanding changes to the set of
+   channels the connection is associated with. If so, the server MUST
+   return NFS4ERR_INVAL.

 18.34.4.
IMPLEMENTATION - If a session's channel loses all connections, the client needs to use - BIND_CONN_TO_SESSION to associate a new connection. If the server - restarted and does not keep the reply cache in stable storage, the - server will not recognize the sessionid. The client will ultimately - have to invoke EXCHANGE_ID to create a new client ID and session. + If a session's channel loses all connections, depending on the client + ID's state protection and type of channel, the client might need to + use BIND_CONN_TO_SESSION to associate a new connection. If the + server restarted and does not keep the reply cache in stable storage, + the server will not recognize the sessionid. The client will + ultimately have to invoke EXCHANGE_ID to create a new client ID and + session. - Assuming SP4_SSV state protection is being used, there is an issue if - SET_SSV is sent, no response is returned, and the last connection - associated with the client ID disconnects. The client, per the - sessions model, needs to retry the SET_SSV. But it needs a new - connection to do so, and needs to associate that connection with the - session via a BIND_CONN_TO_SESSION authenticated with the SSV GSS + Suppose SP4_SSV state protection is being used, and + BIND_CONN_TO_SESSION is among the operations included in the + spo_must_enforce set when the client ID was created (Section 18.35). + If so, there is an issue if SET_SSV is sent, no response is returned, + and the last connection associated with the client ID drops. The + client, per the sessions model, MUST retry the SET_SSV. But it needs + a new connection to do so, and MUST associate that connection with + the session via a BIND_CONN_TO_SESSION authenticated with the SSV GSS mechanism. The problem is that the RPCSEC_GSS message integrity codes use a subkey derived from the SSV as the key and the SSV may have changed. While there are multiple recovery strategies, a - single, general strategy is described here. 
First the client - reconnects. The client assumes the SET_SSV was executed, and so - sends BIND_CONN_TO_SESSION with the subkey derived from new SSV used - as key for the message integrity code in the RPCSEC_GSS credential - message integrity codes. If the server returns an RPC authentication - error, this means the server's current SSV was not changed, and the - SET_SSV was not executed. The client then tries BIND_CONN_TO_SESSION - with the subkey derived from the old SSV as the key for the - RPCSEC_GSS message integrity code. This should not return an RPC - authentication error. If it does, an implementation error has - occurred on either the client or server, and the client has to create - a new client ID. + single, general strategy is described here. + + o The client reconnects. + + o The client assumes the SET_SSV was executed, and so sends + BIND_CONN_TO_SESSION with the subkey derived from the new SSV (what + SET_SSV would have set the SSV to) used as the key for the + RPCSEC_GSS credential message integrity codes. + + o If the request succeeds, this means the original attempted SET_SSV + did execute successfully. The client re-sends the original + SET_SSV, which the server will answer from the reply cache. + + o If the server returns an RPC authentication error, this means the + server's current SSV was not changed (and the SET_SSV was likely + not executed). The client then tries BIND_CONN_TO_SESSION with + the subkey derived from the old SSV as the key for the RPCSEC_GSS + message integrity codes. + + o The attempted BIND_CONN_TO_SESSION with the old SSV should + succeed. If so, the client re-sends the original SET_SSV. If the + original SET_SSV was not executed, then the server executes it. + If the original SET_SSV was executed, but failed, the server will + return the SET_SSV reply from the reply cache. 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID Exchange long hand client and server identifiers (owners), and create a client ID 18.35.1.
ARGUMENT const EXCHGID4_FLAG_SUPP_MOVED_REFER = 0x00000001; const EXCHGID4_FLAG_SUPP_MOVED_MIGR = 0x00000002; @@ -22369,37 +22420,37 @@ EXCHANGE_ID sent with a new incarnation of the client will lead to the server removing lock state of the old incarnation. Whereas an EXCHANGE_ID sent with the current incarnation and co_ownerid will result in an error or an update of the client ID's properties, depending on the arguments to EXCHANGE_ID. A server MUST NOT use the same client ID for two different incarnations of an eir_clientowner. In addition to the client ID and sequence id, the server returns a - server owner (eir_server_owner) and eir_server_scope. The former - field is used for network trunking as described in Section 2.10.4. - The latter field is used to allow clients to determine when clientids - sent by one server may be recognized by another in the event of file - system migration (see Section 11.7.7). + server owner (eir_server_owner) and server scope (eir_server_scope). + The former field is used for network trunking as described in + Section 2.10.4. The latter field is used to allow clients to + determine when client IDs sent by one server may be recognized by + another in the event of file system migration (see Section 11.7.7). The client ID returned by EXCHANGE_ID is only unique relative to the combination of eir_server_owner.so_major_id and eir_server_scope. Thus if two servers return the same client ID, the onus is on the client to distinguish the client IDs on the basis of eir_server_owner.so_major_id and eir_server_scope. In the event two different server's claim matching server_owner.so_major_id and eir_server_scope, the client can use the verification techniques discussed in Section 2.10.4 to determine if the servers are distinct. If they are distinct, then the client will need to note the destination network addresses of the connections used with each - server, and use network address as the final discriminator. 
+ server, and use the network address as the final discriminator. The server, as defined by the unique identity expressed in the so_major_id of the server owner and the server scope, needs to track several properties of each client ID it hands out. The properties apply to the client ID and all sessions associated with the client ID. The properties are derived from the arguments and results of EXCHANGE_ID. The client ID properties include: o The capabilities expressed by the following bits, which come from the results of EXCHANGE_ID: @@ -22445,41 +22496,45 @@ EXCHANGE_ID arguments. Once the client ID is confirmed, this property cannot be updated by subsequent EXCHANGE_ID requests. * The OID of the encryption algorithm. This property is represented by one of the algorithms in the ssp_encr_algs field of the EXCHANGE_ID arguments. Once the client ID is confirmed, this property cannot be updated by subsequent EXCHANGE_ID requests. * The length of the SSV. This property is represented by the - spi_ssv_len in the EXCHANGE_ID results. Once the client ID is - confirmed, this property cannot be updated by subsequent + spi_ssv_len field in the EXCHANGE_ID results. Once the client + ID is confirmed, this property cannot be updated by subsequent EXCHANGE_ID requests. The length of SSV MUST be equal to the length of the key used by the negotiated encryption algorithm. * Number of concurrent versions of the SSV the client and server will support (Section 2.10.8). This property is represented by spi_window, in the EXCHANGE_ID results. The property may be updated by subsequent EXCHANGE_ID requests. o The client's implementation ID as represented by the eia_client_impl_id field of the arguments. The property may be updated by subsequent EXCHANGE_ID requests. + o The server's implementation ID as represented by the + eir_server_impl_id field of the reply. The property may be + updated by replies to subsequent EXCHANGE_ID requests. 
+ The eia_flags passed as part of the arguments and the eir_flags results allow the client and server to inform each other of their capabilities as well as indicate how the client ID will be used. Whether a bit is set or cleared on the arguments' flags does not force the server to set or clear the same bit on the results' side. - Bits not defined above should not be set in the eia_flags field. If - they are, the server MUST reject the operation with NFS4ERR_INVAL. + Bits not defined above cannot be set in the eia_flags field. If they + are, the server MUST reject the operation with NFS4ERR_INVAL. The EXCHGID4_FLAG_UPD_CONFIRMED_REC_A bit can only be set in eia_flags; it is always off in eir_flags. The EXCHGID4_FLAG_CONFIRMED_R bit can only be set in eir_flags; it is always off in eia_flags. If the server recognizes the co_ownerid and co_verifier as mapping to a confirmed client ID, it sets EXCHGID4_FLAG_CONFIRMED_R in eir_flags. The EXCHGID4_FLAG_CONFIRMED_R flag allows a client to tell if the client ID it is trying to create already exists and is confirmed. @@ -22498,23 +22553,24 @@ Section 18.35.4 will apply. Note that if the operation succeeds and returns a client ID that is already confirmed, the server MUST set the EXCHGID4_FLAG_CONFIRMED_R bit in eir_flags. If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is not set in eia_flags, this means the client is trying to establish a new client ID; it is attempting to trunk data communication to the server (Section 2.10.4); or it is attempting to update properties of an unconfirmed client ID. The situations described in Sub-Paragraph 1, Sub-Paragraph 2, Sub-Paragraph 3, Sub-Paragraph 4, or Sub-Paragraph 5 - of Paragraph 6 in Section 18.35.4) will apply. Note that if the - operation succeeds and returns a client ID that is already confirmed, - the server MUST set the EXCHGID4_FLAG_CONFIRMED_R bit in eir_flags. + of Paragraph 6 in Section 18.35.4 will apply. 
Note that if the + operation succeeds and returns a client ID that was previously + confirmed, the server MUST set the EXCHGID4_FLAG_CONFIRMED_R bit in + eir_flags. When the EXCHGID4_FLAG_SUPP_MOVED_REFER flag bit is set, the client indicates that it is capable of dealing with an NFS4ERR_MOVED error as part of a referral sequence. When this bit is not set, it is still legal for the server to perform a referral sequence. However, a server may use the fact that the client is incapable of correctly responding to a referral, by avoiding it for that particular client. It may, for instance, act as a proxy for that particular file system, at some cost in performance, although it is not obligated to do so. If the server will potentially perform a referral, it MUST set @@ -22624,21 +22678,22 @@ operations the server will require SP4_MACH_CRED or SP4_SSV protection for. Normally the server's result equals the client's argument, but the result MAY be different. If the client requests one or more operations in the set { EXCHANGE_ID, CREATE_SESSION, DELEGPURGE, DESTROY_SESSION, BIND_CONN_TO_SESSION, DESTROY_CLIENTID }, then the result spo_must_enforce MUST include the operations the client requested from that set. If spo_must_enforce in the results has BIND_CONN_TO_SESSION set, then connection binding enforcement is enabled, and the client MUST use - the machine or SSV credential on calls to BIND_CONN_TO_SESSION. + the machine (if SP4_MACH_CRED protection is used) or SSV (if SP4_SSV + protection is used) credential on calls to BIND_CONN_TO_SESSION. The second list is spo_must_allow and consists of those operations the client wants to have the option of issuing with the machine credential or the SSV-based credential, even if the object the operations are performed on is not owned by the machine or SSV credential. The corresponding result, also called spo_must_allow, consists of the operations the server will allow the client to use SP4_SSV or SP4_MACH_CRED credentials with. 
Normally the server's result equals @@ -22681,28 +22736,28 @@ algorithm for a server is id-aes256-CBC. The RECOMMENDED algorithms are id-aes192-CBC and id-aes128-CBC [19]. The selected algorithm is returned in spi_encr_alg, an index into ssp_encr_algs. If the server does not support any of the offered algorithms, it returns NFS4ERR_ENCR_ALG_UNSUPP. If ssp_encr_algs is empty, the server MUST return NFS4ERR_INVAL. ssp_window: This is the number of SSV versions the client wants the server to - maintain (i.e. each call to SET_SSV produces a new version of the - SSV). If ssp_window is zero, the server MUST return - NFS4ERR_INVAL. The server responds with spi_window, which MUST - NOT exceed ssp_window, and MUST be at least one (1). Any requests - on the backchannel or forechannel that are using a version of the - SSV that is outside the window will fail with an ONC RPC - authentication error, and the requester will have to retry them - with the same slot id and sequence id. + maintain (i.e. each successful call to SET_SSV produces a new + version of the SSV). If ssp_window is zero, the server MUST + return NFS4ERR_INVAL. The server responds with spi_window, which + MUST NOT exceed ssp_window, and MUST be at least one (1). Any + requests on the backchannel or fore channel that are using a + version of the SSV that is outside the window will fail with an + ONC RPC authentication error, and the requester will have to retry + them with the same slot id and sequence id. ssp_num_gss_handles: This is the number of RPCSEC_GSS handles the server should create that are based on the GSS SSV mechanism (Section 2.10.8). It is not the total number of RPCSEC_GSS handles for the client ID. Indeed, subsequent calls to EXCHANGE_ID will add RPCSEC_GSS handles. The server responds with a list of handles in spi_handles. If the client asks for at least one handle and the server cannot create it, the server MUST return an error. 
The @@ -22903,21 +22959,21 @@ returns NFS4ERR_CLID_INUSE to indicate the client should retry with a different value for the eia_clientowner.co_ownerid subfield of EXCHANGE_ID4args. The client record is not changed. 4. Replacement of Unconfirmed Record If the EXCHGID4_FLAG_UPD_CONFIRMED_REC_A flag is not set, and the server has the following unconfirmed record then the client is attempting EXCHANGE_ID again on an unconfirmed - client ID, perhaps do to a retry, or perhaps due to a client + client ID, perhaps due to a retry, or perhaps due to a client restart before client ID confirmation (i.e. before CREATE_SESSION was called), or some other reason. { ownerid_arg, *, *, old_clientid_ret, unconfirmed } It is possible the properties of old_clientid_ret are different than those specified in the current EXCHANGE_ID. Whether the properties are being updated or not, to eliminate ambiguity, the server deletes the unconfirmed record, generates a new client ID (clientid_ret) and establishes the @@ -22942,25 +22999,43 @@ to wait for the lease time on the previous incarnation to expire. Furthermore, session state should be removed since if the client had maintained that information across restart, this request would not have been sent. If the server does not support the CLAIM_DELEGATE_PREV claim type, associated delegations should be purged as well; otherwise, delegations are retained and recovery proceeds according to Section 10.2.1. After processing, clientid_ret is returned to the client and - the client record is replaced with: + this client record is added: { ownerid_arg, verifier_arg, principal_arg, clientid_ret, unconfirmed } + The previously described confirmed record continues to exist, + and thus the same ownerid_arg exists in both a confirmed and + unconfirmed state at the same time. The number of states can + collapse to one once the server receives an applicable + CREATE_SESSION or EXCHANGE_ID. 
+ + + If the server subsequently receives a successful + CREATE_SESSION that confirms clientid_ret, then the server + atomically destroys the confirmed record and makes the + unconfirmed record confirmed as described in + Section 18.36.4. + + + If the server instead subsequently receives an EXCHANGE_ID + with the client owner equal to ownerid_arg, one strategy is + to simply delete the unconfirmed record, and process the + EXCHANGE_ID as described in the entirety of + Section 18.35.4. + 6. Update If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is set, and the server has the following confirmed record, then this request is an attempt at an update. { ownerid_arg, verifier_arg, principal_arg, clientid_ret, confirmed } Since the record has been confirmed, the client must have @@ -23075,35 +23149,35 @@ returns the parameter values for the new session. o The connection CREATE_SESSION is sent over is associated with the session's fore channel. The arguments and results of CREATE_SESSION are described as follows: csa_clientid: This is the client ID the new session will be associated with. - The corresponding result is csr_sessionid, sessionid of the new - session. + The corresponding result is csr_sessionid, the sessionid of the + new session. csa_sequence: Each client ID serializes CREATE_SESSION via a per client ID - sequence number. See Section 18.36.4. The corresponding result + sequence number (see Section 18.36.4). The corresponding result is csr_sequence, which MUST be equal to csa_sequence. In the next three arguments, the client offers a value that is to be a property of the session. It is RECOMMENDED that the server accept the value. If it is not acceptable, the server MAY use a different value. Regardless, the server MUST return the value the session will - uses (which will be either what the client offered, or what the - server is insisting on). return the value used to the client. 
These + use (which will be either what the client offered, or what the server + is insisting on). return the value used to the client. These parameters have the following interpretation. csa_flags: The csa_flags field contains a list of the following flag bits: CREATE_SESSION4_FLAG_PERSIST: If CREATE_SESSION4_FLAG_PERSIST is set, the client wants the server to provide a persistent reply cache. For sessions in @@ -23125,37 +23199,35 @@ CREATE_SESSION is called over for the backchannel as well as the fore channel. The server sets CREATE_SESSION4_FLAG_CONN_BACK_CHAN in the result field csr_flags if it agrees. If CREATE_SESSION4_FLAG_CONN_BACK_CHAN is not set in csa_flags, then CREATE_SESSION4_FLAG_CONN_BACK_CHAN MUST NOT be set in csr_flags. CREATE_SESSION4_FLAG_CONN_RDMA: - If CREATE_SESSION4_FLAG_CONN_RDMA is set in csa_flags, the - connection CREATE_SESSION is called over is currently in non- - RDMA mode, but has the capability to operate in RDMA mode, then - client is requesting the server agree to "step up" to RDMA mode - on the connection. The server sets + If CREATE_SESSION4_FLAG_CONN_RDMA is set in csa_flags, and if + the connection CREATE_SESSION is called over is currently in + non-RDMA mode, but has the capability to operate in RDMA mode, + then client is requesting the server agree to "step up" to RDMA + mode on the connection. The server sets CREATE_SESSION4_FLAG_CONN_RDMA in the result field csr_flags if it agrees. If CREATE_SESSION4_FLAG_CONN_RDMA is not set in csa_flags, then CREATE_SESSION4_FLAG_CONN_RDMA MUST NOT be set in csr_flags. Note that once the server agrees to step up, it and the client MUST exchange all future traffic on the connection with RPC RDMA framing and not Record Marking ([8]). 
- csa_fore_chan_attrs: - - csa_back_chan_attrs: + csa_fore_chan_attrs, csa_back_chan_attrs: - The csa_fore_char_attrs and csa_back_chan_attrs fields apply to + The csa_fore_chan_attrs and csa_back_chan_attrs fields apply to attributes of the fore channel (which conveys requests originating from the client to the server), and the backchannel (the channel that conveys callback requests originating from the server to the client), respectively. The results are in corresponding structures called csr_fore_chan_attrs and csr_back_chan_attrs. The results establish attributes for each channel, and on all subsequent use of each channel of the session. Each structure has the following fields: ca_headerpadsize: @@ -23176,30 +23248,30 @@ TCP/IP connection, and that it has a single Record Marking header preceding it. The maximum allowable count encoded in the header will be ca_maxrequestsize. If a requester sends a request that exceeds ca_maxrequestsize, the error NFS4ERR_REQ_TOO_BIG will be returned per the description in Section 2.10.5.4. ca_maxresponsesize: The maximum size of a COMPOUND or CB_COMPOUND reply that the - replier will accept from the requester including RPC headers + requester will accept from the replier including RPC headers (see the ca_maxrequestsize definition). The NFSv4.1 server MUST NOT increase the value of this parameter in the CREATE_SESSION results. However, if the client selects a value for ca_maxresponsesize such that a replier on a channel could never send a response, the server SHOULD return - NFS4ERR_TOOSMALL to in the CREATE_SESSION reply. If a - requester sends a request for which the size of the reply would - exceed this value, the replier will return NFS4ERR_REP_TOO_BIG, - per the description in Section 2.10.5.4. + NFS4ERR_TOOSMALL in the CREATE_SESSION reply. If a requester + sends a request for which the size of the reply would exceed + this value, the replier will return NFS4ERR_REP_TOO_BIG, per + the description in Section 2.10.5.4.
ca_maxresponsesize_cached: Like ca_maxresponsesize, but the maximum size of a reply that will be stored in the reply cache (Section 2.10.5.1). If the reply to CREATE_SESSION has ca_maxresponsesize_cached less than ca_maxresponsesize, then this is an indication to the requester on the channel that it needs to be selective about which replies it directs the replier to cache; for example large replies from nonidempotent operations (e.g. COMPOUND requests @@ -23260,32 +23332,33 @@ There is no corresponding result. The RPCSEC_GSS context for the backchannel is specified via a pair of values of data type gsshandle4_t. The data type gsshandle4_t represents an RPCSEC_GSS handle, and is precisely the same as the data type of the "handle" field of the rpc_gss_init_res data type defined in Section 5.2.3.1, "Context Creation Response - Successful Acceptance" of [4]. The first RPCSEC_GSS handle, gcbp_handle_from_server, is the fore - handle the server returned to the client (in the handle field of - data type rpc_gss_init_res) when the RPCSEC_GSS context was - created on the server. The second handle, + handle the server returned to the client (either in the handle + field of data type rpc_gss_init_res or one of the elements of the + spi_handles field returned in the reply to EXCHANGE_ID) when the + RPCSEC_GSS context was created on the server. The second handle, gcbp_handle_from_client, is the back handle the client will map the RPCSEC_GSS context to. The server can immediately use the value of gcbp_handle_from_client in the RPCSEC_GSS credential in callback RPCs. I.e., the value in gcbp_handle_from_client can be used as the value of the field "handle" in data type rpc_gss_cred_t (see Section 5, "Elements of the RPCSEC_GSS - Security Protocol" of [4]) in callback RPCs. The server must use + Security Protocol" of [4]) in callback RPCs. The server MUST use the RPCSEC_GSS security service specified in gcbp_service, i.e. 
it - must set the "service" field of the rpc_gss_cred_t data type in + MUST set the "service" field of the rpc_gss_cred_t data type in RPCSEC_GSS credential to the value of gcbp_service (see Section 5.3.1, "RPC Request Header", of [4]). If the RPCSEC_GSS handle identified by gcbp_handle_from_server does not exist on the server, the server will return NFS4ERR_NOENT. Note that while the GSS context state is shared between the fore and back RPCSEC_GSS contexts, the fore and back RPCSEC_GSS context state are independent of each other as far as the RPCSEC_GSS @@ -23317,27 +23390,28 @@ CREATE_SESSION requests for a given client ID. o Second, the size of the client ID reply cache is of one slot (and as a result, the CREATE_SESSION request does not carry a slot number). This means that at most one CREATE_SESSION request for a given client ID can be outstanding. When a client sends a successful EXCHANGE_ID and it is returned an unconfirmed client ID, the client is also returned eir_sequenceid, and the client is expected to set the value of csa_sequenceid in the - client ID confirming CREATE_SESSION it sends with that client ID to - the value of eir_sequenceid. After EXCHANGE_ID, the server - initializes the client ID slot to be equal to eir_sequenceid - 1 - (accounting for underflow), and records a contrived CREATE_SESSION - result with a "cached" result of NFS4ERR_SEQ_MISORDERED. With the - slot thus initialized, the processing of the CREATE_SESSION operation - is divided into four phases: + client ID-confirming-CREATE_SESSION it sends with that client ID to + the value of eir_sequenceid. When EXCHANGE_ID returns a new, + unconfirmed client ID, the server initializes the client ID slot to + be equal to eir_sequenceid - 1 (accounting for underflow), and + records a contrived CREATE_SESSION result with a "cached" result of + NFS4ERR_SEQ_MISORDERED. With the slot thus initialized, the + processing of the CREATE_SESSION operation is divided into four + phases: 1. 
Client record lookup. The server looks up the client ID in its client record table. If the server contains no records with client ID equal to clientid_arg, then most likely the client's state has been purged during a period of inactivity, possibly due to a loss of connectivity. NFS4ERR_STALE_CLIENTID is returned, and no changes are made to any client records on the server. Otherwise, the server goes to phase 2. 2. Sequence id processing. If csa_sequenceid is equal to the @@ -23358,28 +23432,40 @@ client ID. Otherwise the client ID confirmation phase is skipped and only the session creation phase occurs. Any case in which there is more than one record with identical values for client ID represents a server implementation error. Operation in the potential valid cases is summarized as follows. * Successful Confirmation If the server has the following unconfirmed record, then this is the expected confirmation of an unconfirmed record. - { *, *, principal_arg, clientid_arg, unconfirmed } + { ownerid, verifier, principal_arg, clientid_arg, + unconfirmed } - The record is replaced with: + As noted in Section 18.35.4, the server might also have the + following confirmed record. - { *, *, principal_arg, clientid_arg, confirmed } + { ownerid, old_verifier, principal_arg, old_clientid, + confirmed } - The processing of the operation continues to session - creation. + The server schedules the atomic replacement of both records + with: + + { ownerid, verifier, principal_arg, clientid_arg, confirmed + } + + The processing of CREATE_SESSION continues on to session + creation. Once the session is successfully created, the + scheduled client record replacement is committed. If the + session is not successfully created, then no changes are + made to any client records on the server.
* Unsuccessful Confirmation If the server has the following record, then the client has changed principals after the previous EXCHANGE_ID request, or there has been a chance collision between shorthand client identifiers. { *, *, old_principal_arg, clientid_arg, * } @@ -23388,42 +23474,44 @@ changes are made to any client records on the server. 4. Session creation. The server confirmed the client ID, either in this CREATE_SESSION operation, or a previous CREATE_SESSION operation. The server examines the remaining fields of the arguments. 5. The server creates the session by recording the parameter values used (including whether the CREATE_SESSION4_FLAG_PERSIST flag is set and has been accepted by the server) and allocating space for - the session reply cache. For each slot in the reply cache, the + the session reply cache (if there is not enough space, the server + returns NFS4ERR_NOSPC). For each slot in the reply cache, the server sets the sequence id to zero (0), and records an entry - containing a COMPOUND reply with a zero operations and the error - of NFS4ERR_SEQ_MISORDERED. This way, if the first SEQUENCE - request sent has a sequenceid equal to zero, the server can - simply return what is in the reply cache: NFS4ERR_SEQ_MISORDERED. - The client initializes its reply cache for receiving callbacks in - the same way, and similarly, the first CB_SEQUENCE operation on a + containing a COMPOUND reply with zero operations and the error + NFS4ERR_SEQ_MISORDERED. This way, if the first SEQUENCE request + sent has a sequence id equal to zero, the server can simply + return what is in the reply cache: NFS4ERR_SEQ_MISORDERED. The + client initializes its reply cache for receiving callbacks in the + same way, and similarly, the first CB_SEQUENCE operation on a slot after session creation must have a sequence id of one. 6. If the session state is created successfully, the server associates the session with the client ID provided by the client. 7. 
When a request that had CREATE_SESSION4_FLAG_CONN_RDMA set needs to be retried, the retry MUST be done on a new connection that is in non-RDMA mode. If properties of the new connection are different enough that the arguments to CREATE_SESSION must change, then a non-retry MUST be sent. The server will - eventually dispose of any session that was created. + eventually dispose of any session that was created on the + original connection. On the backchannel, the client and server might wish to have many - slots, in some cases perhaps more that the fore channel in to deal + slots, in some cases perhaps more than the fore channel, in order to deal with the situations where the network link has high latency and is the primary bottleneck for response to recalls. If so, and if the client provides too few slots to the backchannel, the server might limit the number of recallable objects it gives to the client. Implementing RPCSEC_GSS callback support requires that the client and server change their RPCSEC_GSS implementations. One possible set of changes includes: o Adding a data structure that wraps the GSS-API context with a @@ -23443,23 +23531,20 @@ be incremented. o Adding a function to create a new RPCSEC_GSS handle from a pointer to the wrapper data structure. The reference count would be incremented. o Replacing calls from RPCSEC_GSS that free GSS-API contexts, with calls to decrement the reference count on the wrapper data structure. - If the server cannot reserve space for the reply cache, it MAY return - NFS4ERR_NOSPC. - 18.37. Operation 44: DESTROY_SESSION - Destroy existing session Destroy existing session. 18.37.1. ARGUMENT struct DESTROY_SESSION4args { sessionid4 dsa_sessionid; }; @@ -23543,24 +23628,20 @@ When a stateid is freed which had been associated with revoked locks, the client, by doing the FREE_STATEID, acknowledges the loss of those locks. 
This allows the server, once all such revoked state is acknowledged, to allow that client again to reclaim locks, without encountering the edge conditions discussed in Section 8.4.2. Once a successful FREE_STATEID is done for a given stateid, any subsequent use of that stateid will result in an NFS4ERR_BAD_STATEID error. -18.38.4. IMPLEMENTATION - - No discussion at this time. - 18.39. Operation 46: GET_DIR_DELEGATION - Get a directory delegation Obtain a directory delegation. 18.39.1. ARGUMENT typedef nfstime4 attr_notice4; struct GET_DIR_DELEGATION4args { /* CURRENT_FH: delegated directory */ @@ -23619,53 +23700,53 @@ delegation, the delegation will be recalled unless the client has asked for notification for this event. The server will also return a directory delegation stateid, gddr_stateid, as a result of the GET_DIR_DELEGATION operation. This stateid will appear in callback messages related to the delegation, such as notifications and delegation recalls. The client will use this stateid to return the delegation voluntarily or upon recall. A delegation is returned by calling the DELEGRETURN operation. - The server may not be able to support notifications of certain - events. If the client asks for such notifications, the server must + The server might not be able to support notifications of certain + events. If the client asks for such notifications, the server MUST inform the client of its inability to do so as part of the GET_DIR_DELEGATION reply by not setting the appropriate bits in the supported notifications bitmask, gddr_notification, contained in the - reply. The server should not add bits to gddr_notification that the + reply. The server MUST NOT add bits to gddr_notification that the client did not request. The GET_DIR_DELEGATION operation can be used for both normal and named attribute directories. If the client sets gdda_signal_deleg_avail to TRUE, then it is registering with the server a "want" for a directory delegation. 
If the delegation is not available, and the server supports and will honor the "want", the results will have gddrnf_will_signal_deleg_avail set to TRUE and no error will be indicated on return. If so the client should expect a future CB_RECALLABLE_OBJ_AVAIL operation to indicate that a directory delegation is available. If the server does not wish to honor the "want" or is not able to do so, it returns the error NFS4ERR_DIRDELEG_UNAVAIL. If the delegation is immediately - available, the server may return it with the response to the + available, the server SHOULD return it with the response to the operation, rather than via a callback. 18.39.4. IMPLEMENTATION - Directory delegation provides the benefit of improving cache + Directory delegations provide the benefit of improving cache consistency of namespace information. This is done through synchronous callbacks. A server must support synchronous callbacks in order to support directory delegations. In addition to that, asynchronous notifications provide a way to reduce network traffic as well as improve client performance in certain conditions. - Notifications would not be requested when the goal is just cache + Notifications should not be requested when the goal is just cache consistency. Notifications are specified in terms of potential changes to the directory. A client can ask to be notified of events by setting one or more bits in gdda_notification_types. The client can ask for notifications on addition of entries to a directory (by setting the NOTIFY4_ADD_ENTRY in gdda_notification_types), notifications on entry removal (NOTIFY4_REMOVE_ENTRY), renames (NOTIFY4_RENAME_ENTRY), directory attribute changes (NOTIFY4_CHANGE_DIR_ATTRIBUTES), and cookie verifier changes (NOTIFY4_CHANGE_COOKIE_VERIFIER) by setting @@ -23673,55 +23754,55 @@ The client can also ask for notifications of changes to attributes of directory entries (NOTIFY4_CHANGE_CHILD_ATTRIBUTES) in order to keep its attribute cache up to date. 
However any changes made to child attributes do not cause the delegation to be recalled. If a client is interested in directory entry caching, or negative name caching, it can set the gdda_notification_types appropriately to its particular need and the server will notify it of all changes that would otherwise invalidate its name cache. The kind of notification a client asks for may depend on the directory size, its rate of - change and the applications being used to access that directory. - However, the conditions under which a client might ask for a - notification, is out of the scope of this specification. + change and the applications being used to access that directory. The + enumeration of the conditions under which a client might ask for a + notification is out of the scope of this specification. For attribute notifications, the client will set bits in the gdda_dir_attributes bitmap to indicate which attributes it wants to be notified of. If the server does not support notifications for - changes to a certain attribute, it should not set that attribute in + changes to a certain attribute, it SHOULD NOT set that attribute in the supported attribute bitmap specified in the reply (gddr_dir_attributes). The client will also set in the gdda_child_attributes bitmap the attributes of directory entries it wants to be notified of, and the server will indicate in gddr_child_attributes which attributes of directory entries it will notify the client of. The client will also let the server know if it wants to get the notification as soon as the attribute change occurs or after a certain delay by setting a delay factor; gdda_child_attr_delay is for attribute changes to directory entries and gdda_dir_attr_delay is for attribute changes to the directory. If this delay factor is set to zero, that indicates to the server that the client wants to be notified of any attribute changes as soon as they occur. 
If the delay factor is set to N seconds, the server will make a best effort - guarantee that attribute updates are not out of sync by more than N - seconds. If the client asks for a delay factor that the server does - not support or that may cause significant resource consumption on the + guarantee that attribute updates are synchronized within N seconds. + If the client asks for a delay factor that the server does not + support or that may cause significant resource consumption on the server by causing the server to send a lot of notifications, the server should not commit to sending out notifications for attributes and therefore must not set the appropriate bit in the gddr_child_attributes and gddr_dir_attributes bitmaps in the response. - The client should use a security flavor that the file system is - exported with. If it uses a different flavor, the server should - return NFS4ERR_WRONGSEC to the operation that precedes + The client should use a security tuple that the file system is + exported with. If it uses a different tuple, the server should + return NFS4ERR_WRONGSEC to the operation that both precedes GET_DIR_DELEGATION and sets the current filehandle. The directory delegation covers all the entries in the directory except the parent entry. That means if a directory and its parent both hold directory delegations, any changes to the parent will not cause a notification to be sent for the child even though the child's parent entry points to the parent directory. 18.40. Operation 47: GETDEVICEINFO - Get Device Information @@ -23745,30 +23826,31 @@ case NFS4_OK: GETDEVICEINFO4resok gdir_resok4; case NFS4ERR_TOOSMALL: count4 gdir_mincount; default: void; }; 18.40.3. DESCRIPTION - Returns device address information for the specified device ID. The - client identifies the device information to be returned by providing - the gdia_device_id and gdia_layout_type that uniquely identify the - device address. 
The client provides gdia_maxcount to limit the - number of bytes for the result. This maximum size represents all of - the data being returned within the GETDEVICEINFO4resok structure and - includes the XDR overhead. The server may return less data. If the - server is unable to return the information within the gdia_maxcount - limit, the error NFS4ERR_TOOSMALL will be returned. However, if - gdia_maxcount is zero, NFS4ERR_TOOSMALL MUST NOT be returned. + Returns pNFS storage device address information for the specified + device ID. The client identifies the device information to be + returned by providing the gdia_device_id and gdia_layout_type that + uniquely identify the device. The client provides gdia_maxcount to + limit the number of bytes for the result. This maximum size + represents all of the data being returned within the + GETDEVICEINFO4resok structure and includes the XDR overhead. The + server may return less data. If the server is unable to return any + information within the gdia_maxcount limit, the error + NFS4ERR_TOOSMALL will be returned. However, if gdia_maxcount is + zero, NFS4ERR_TOOSMALL MUST NOT be returned. The da_layout_type field of the gdir_device_addr returned by the server MUST be equal to the gdia_layout_type specified by the client. If it is not equal, the client SHOULD ignore the response as invalid and behave as if the server returned an error, even if the client does have support for the layout type returned. The client also provides a notification bitmap, gdia_notify_types for the device ID mapping notification for which it is interested in receiving; the server must support device ID notifications for the @@ -23781,21 +23863,21 @@ (see Section 20.12). The notification bitmap applies only to the specified device ID. If a client issues GETDEVICEINFO on a deviceID multiple times, the last notification bitmap is used by the server for subsequent notifications. 
If the bitmap is zero or empty, then the device ID's notifications are turned off. If the client wants to just update or turn off notifications, it MAY issue GETDEVICEINFO with gdia_maxcount set to zero. In that event, - if the device ID is valid, the da_addr_body field of the + if the device ID is valid, the reply's da_addr_body field of the gdir_device_addr field will be of zero length. If an unknown device ID is given in gdia_device_id, the server returns NFS4ERR_NOENT. Otherwise, the device address information is returned in gdir_device_addr. Finally, if the server supports notifications for device ID mappings, the gdir_notification result will contain a bitmap of which notifications it will actually send to the client (via CB_NOTIFY_DEVICEID, see Section 20.12). If NFS4ERR_TOOSMALL is returned, the results also contain @@ -23823,32 +23905,32 @@ o CB_NOTIFY_DEVICEID deletes a device ID. If the client believes it has layouts that refer to the device ID, then it is possible the layouts have been revoked. The client should send a TEST_STATEID request using the stateid for each layout that might have been revoked. If TEST_STATEID indicates any layouts have been revoked, the client must recover from layout revocation as described in Section 12.5.6. If TEST_STATEID indicates at least one layout has not been revoked, the client should send a GETDEVICEINFO on the device ID to verify that the device ID has been deleted. If GETDEVICEINFO indicates the device ID does not exist, the client - then assumes the server is broken, and recovers issuing + then assumes the server is faulty, and recovers by issuing EXCHANGE_ID. If the client does not have layouts that refer to the device ID, no harm is done. The client should mark the device ID as deleted, and when the GETDEVICEINFO or GETDEVICELIST results are finally received for the device ID, delete the device ID from the client's cache. 
o CB_NOTIFY_DEVICEID indicates a device ID's device addressing mappings have changed. 
The client should assume that the results from the in-progress GETDEVICEINFO will be stale for the device ID - once received, and so it should send a GETDEVICEINFO on the device - ID. + once received, and so it should send another GETDEVICEINFO on the + device ID. 18.41. Operation 48: GETDEVICELIST - Get All Device Mappings for a File System 18.41.1. ARGUMENT struct GETDEVICELIST4args { /* CURRENT_FH: object belonging to the file system */ layouttype4 gdla_layout_type; @@ -23876,26 +23958,30 @@ }; 18.41.3. DESCRIPTION This operation is used by the client to enumerate all of the device IDs a server's file system uses. The client provides a current filehandle of a file object that belongs to the file system (i.e. all file objects sharing the same fsid as that of the current filehandle), and the layout type in - gdia_layout_type. Since this operation may require multiple calls to - enumerate all the device IDs (and is thus similar to the READDIR + gdla_layout_type. Since this operation might require multiple calls + to enumerate all the device IDs (and is thus similar to the READDIR (Section 18.23) operation), the client also provides gdla_cookie and gdla_cookieverf to specify the current cursor position in the list. - The client provides gdla_maxdevices to limit the number of device IDs - in the result. The server MAY return fewer device IDs. + When the client wants to read from the beginning of the file system's + device mappings, it sets gdla_cookie to zero. The field + gdla_cookieverf MUST be ignored by the server when gdla_cookie is + zero. The client provides gdla_maxdevices to limit the number of + device IDs in the result. If gdla_maxdevices is zero, the server + MUST return NFS4ERR_INVAL. The server MAY return fewer device IDs. The successful response to the operation will contain the cookie, gdlr_cookie, and cookie verifier, gdlr_cookieverf, to be used on the subsequent GETDEVICELIST. 
A gdlr_eof value of TRUE signifies that there are no remaining entries in the server's device list. Each element of gdlr_deviceid_list contains a device ID. 18.41.4. IMPLEMENTATION An example of the use of this operation is for pNFS clients and @@ -23951,26 +24037,26 @@ default: void; }; 18.42.3. DESCRIPTION Commits changes in the layout represented by the current filehandle, client ID (derived from the sessionid in the preceding SEQUENCE operation), byte range, and stateid. Since layouts are sub- dividable, a smaller portion of a layout, retrieved via LAYOUTGET, - may be committed. The region being committed is specified through + can be committed. The region being committed is specified through the byte range (loca_offset and loca_length). This region MUST overlap with one or more existing layouts previously granted via LAYOUTGET (Section 18.43), each with an iomode of LAYOUTIOMODE4_RW. - In the case where any held layout segments iomode is not - LAYOUTIOMODE4_RW the server should return the error + In the case where the iomode of any held layout segment is not + LAYOUTIOMODE4_RW, the server should return the error NFS4ERR_BAD_IOMODE. For the case where the client does not hold matching layout segment(s) for the defined region, the server should return the error NFS4ERR_BAD_LAYOUT. The LAYOUTCOMMIT operation indicates that the client has completed writes using a layout obtained by a previous LAYOUTGET. The client may have only written a subset of the data range it previously requested. LAYOUTCOMMIT allows it to commit or discard provisionally allocated space and to update the server with a new end of file. The layout referenced by LAYOUTCOMMIT is still valid after the operation @@ -24020,21 +24106,21 @@ result of the LAYOUTCOMMIT operation, it must return the new size (locr_newsize.ns_size) as part of the results. The loca_time_modify field allows the client to suggest a modification time it would like the metadata server to set. 
The metadata server may use the suggestion or it may use the time of the LAYOUTCOMMIT operation to set the modification time. If the metadata server uses the client provided modification time, it should ensure time does not flow backwards. If the client wants to force the metadata server to set an exact time, the client should use a SETATTR - operation in a compound right after LAYOUTCOMMIT. See Section 12.5.4 + operation in a COMPOUND right after LAYOUTCOMMIT. See Section 12.5.4 for more details. If the client desires the resultant modification time it should construct the COMPOUND so that a GETATTR follows the LAYOUTCOMMIT. The loca_layoutupdate argument to LAYOUTCOMMIT provides a mechanism for a client to provide layout specific updates to the metadata server. For example, the layout update can describe what regions of the original layout have been used and what regions can be deallocated. There is no NFSv4.1 file layout-specific layoutupdate4 structure. @@ -24098,39 +24184,39 @@ case NFS4_OK: LAYOUTGET4resok logr_resok4; case NFS4ERR_LAYOUTTRYLATER: bool logr_will_signal_layout_avail; default: void; }; 18.43.3. DESCRIPTION - Requests a layout from the metadata server for reading or writing - (and reading) the file given by the filehandle at the byte range - specified by offset and length. Layouts are identified by the client - ID (derived from the sessionid in the preceding SEQUENCE operation), - current filehandle, layout type (loga_layout_type), and the layout - stateid (loga_stateid). The use of the loga_iomode depends upon the + Requests a layout from the metadata server for reading or writing the + file given by the filehandle at the byte range specified by offset + and length. Layouts are identified by the client ID (derived from + the sessionid in the preceding SEQUENCE operation), current + filehandle, layout type (loga_layout_type), and the layout stateid + (loga_stateid). 
The use of the loga_iomode field depends upon the layout type, but should reflect the client's data access intent. If the metadata server is in a grace period, and does not persist layouts and device ID to device address mappings, then it MUST return NFS4ERR_GRACE (see Section 8.4.2.1). The LAYOUTGET operation returns layout information for the specified byte range: a layout. To get a layout from a specific offset through the end-of-file, regardless of the file's length, a loga_length field - with all bits set to 1 (one) should be used. If loga_length is zero, - or if a loga_length which is not all bits set to one is specified, - and loga_length when added to loga_offset exceeds the maximum 64-bit - unsigned integer value, the error NFS4ERR_INVAL will result. + set to NFS4_UINT64_MAX is used. If loga_length is zero, or if a + loga_length which is not NFS4_UINT64_MAX is specified, and the sum of + loga_length and loga_offset exceeds NFS4_UINT64_MAX, the error + NFS4ERR_INVAL will result. The loga_minlength field specifies the minimum length of layout the server MUST return with two exceptions: 1. The argument loga_iomode was set to LAYOUTIOMODE4_READ, and loga_offset plus loga_minlength goes past the end of the file. 2. The range from loga_offset through loga_offset + loga_minlength - 1 overlaps two or more striping patterns. In which case, logr_layout will contain two or more elements, and the sum of the @@ -24178,24 +24264,24 @@ The logr_return_on_close result field is a directive to return the layout before closing the file. When the server sets this return value to TRUE, it MUST be prepared to recall the layout in the case the client fails to return the layout before close. For the server that knows a layout must be returned before a close of the file, this return value can be used to communicate the desired behavior to the client and thus remove one extra step from the client's and server's interaction. 
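The loga_length rules stated in the LAYOUTGET DESCRIPTION above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function name check_layoutget_range and the use of plain strings for status codes are hypothetical and chosen for the example; NFS4_UINT64_MAX, loga_offset, loga_length, and NFS4ERR_INVAL are the only names taken from the text.

```python
# Largest unsigned 64-bit value; a loga_length equal to it means
# "from loga_offset through the end-of-file".
NFS4_UINT64_MAX = 0xFFFFFFFFFFFFFFFF


def check_layoutget_range(loga_offset: int, loga_length: int) -> str:
    """Hypothetical helper: validate the LAYOUTGET byte range arguments."""
    # A zero-length request is always invalid.
    if loga_length == 0:
        return "NFS4ERR_INVAL"
    # An explicit length must not run past the 64-bit offset space.
    # Python integers do not wrap, so the sum can be compared directly.
    if (loga_length != NFS4_UINT64_MAX
            and loga_offset + loga_length > NFS4_UINT64_MAX):
        return "NFS4ERR_INVAL"
    return "NFS4_OK"
```

In a language with fixed-width integers the overflow test would have to be written without computing the sum, for example as loga_length > NFS4_UINT64_MAX - loga_offset.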
- The logr_stateid, as with all stateid processing, is returned to the - client for use in subsequent layout related operations. See - Section 8.2, Section 12.5.3, and Section 12.5.5.2 for a further - discussion and requirements. + The logr_stateid stateid is returned to the client for use in + subsequent layout related operations. See Section 8.2, + Section 12.5.3, and Section 12.5.5.2 for a further discussion and + requirements. The format of the returned layout (lo_content) is specific to the layout type. The value of the layout type (lo_content.loc_type) for each of the elements of the array of layouts returned by the server (logr_layout) MUST be equal to the loga_layout_type specified by the client. If it is not equal, the client SHOULD ignore the response as invalid and behave as if the server returned an error, even if the client does have support for the layout type returned. If layouts are not supported for the requested file or its containing @@ -24226,26 +24312,29 @@ supports and will honor the "want", the results will have logr_will_signal_layout_avail set to TRUE. If so the client should expect a CB_RECALLABLE_OBJ_AVAIL operation to indicate that a layout is available. On success, the current filehandle retains its value and the current stateid is updated to match the value as returned in the results. 18.43.4. IMPLEMENTATION - Typically, LAYOUTGET will be called as part of a compound RPC after - an OPEN operation and results in the client having location - information for the file; a client may also hold a layout across - multiple OPENs. The client specifies a layout type that limits what - kind of layout the server will return. This prevents servers from - issuing layouts that are unusable by the client. 
+ Typically, LAYOUTGET will be called as part of a COMPOUND request + after an OPEN operation and results in the client having location + information for the file; this requires that loga_stateid be set to + the special stateid that tells the server to use the current stateid, + which is set by OPEN (see Section 16.2.3.1.2). A client may also + hold a layout across multiple OPENs. The client specifies a layout + type that limits what kind of layout the server will return. This + prevents servers from issuing layouts that are unusable by the + client. Once the client has obtained a layout referring to a particular device ID, the server MUST NOT delete the device ID until the layout is returned or revoked. CB_NOTIFY_DEVICEID can race with LAYOUTGET. One race scenario is that LAYOUTGET returns a device ID the client does not have device address mappings for, and the server sends a CB_NOTIFY_DEVICEID to add the device ID to the client's awareness and meanwhile the client sends GETDEVICEINFO on the device ID. This scenario is discussed in @@ -24308,30 +24397,31 @@ union LAYOUTRETURN4res switch (nfsstat4 lorr_status) { case NFS4_OK: layoutreturn_stateid lorr_stateid; default: void; }; 18.44.3. DESCRIPTION - This operation returns one or more layouts represented by the client - ID (derived from the sessionid in the preceding SEQUENCE operation), - lora_layout_type, and lora_iomode. When lr_returntype is - LAYOUTRETURN4_FILE, the returned layout is further identified by the - current filehandle, lrf_offset, lrf_length, and lrf_stateid. If the - lrf_length is all 1s, all bytes of the layout, starting at lrf_offset - are returned. When lr_returntype is LAYOUTRETURN4_FSID the current - filehandle is used to identify the file system and all layouts - matching the client ID, lora_layout_type, and lora_iomode are - returned. 
When lr_returntype is LAYOUTRETURN4_ALL all layouts + This operation returns from the client to the server one or more + layouts represented by the client ID (derived from the sessionid in + the preceding SEQUENCE operation), lora_layout_type, and lora_iomode. + When lr_returntype is LAYOUTRETURN4_FILE, the returned layout is + further identified by the current filehandle, lrf_offset, lrf_length, + and lrf_stateid. If the lrf_length field is NFS4_UINT64_MAX, all + bytes of the layout, starting at lrf_offset are returned. When + lr_returntype is LAYOUTRETURN4_FSID, the current filehandle is used + to identify the file system and all layouts matching the client ID, + the fsid of the file system, lora_layout_type, and lora_iomode are + returned. When lr_returntype is LAYOUTRETURN4_ALL, all layouts matching the client ID, lora_layout_type, and lora_iomode are returned and the current filehandle is not used. After this call, the client MUST NOT use the returned layout(s) and the associated storage protocol to access the file data. If the set of layouts designated in the case of LAYOUTRETURN4_FSID or LAYOUTRETURN4_ALL is empty, then no error results. In the case of LAYOUTRETURN4_FILE, the byte range specified is returned even if it is a subdivision of a layout previously obtained with LAYOUTGET, a combination of multiple layouts previously obtained with LAYOUTGET, @@ -24358,94 +24448,97 @@ LAYOUTRETURN4_ALL) are being returned. In the case that lr_returntype is LAYOUTRETURN4_FILE, the lrf_stateid provided by the client is a layout stateid as returned from previous layout operations. Note that the "seqid" field of lrf_stateid MUST NOT be zero. See Section 8.2, Section 12.5.3, and Section 12.5.5.2 for a further discussion and requirements. Return of a layout or all layouts does not invalidate the mapping of storage device ID to storage device address which remains in effect - until specifically recalled or changed via notification callbacks. 
+ until specifically changed or deleted via device ID notification + callbacks. - The lora_reclaim field set to TRUE in a LAYOUTRETURN request - specifies that the client is attempting to return a layout that was - acquired before the restart of the metadata server during the - metadata server's grace period. When returning layouts that were - acquired during the metadata server's grace period MUST set the - lora_reclaim field to FALSE. The lora_reclaim field MUST be set to - FALSE also when lr_layoutreturn is LAYOUTRETURN4_FSID or - LAYOUTRETURN4_ALL. See LAYOUTCOMMIT (Section 18.42) for more - details. + If the lora_reclaim field is set to TRUE, the client is attempting to + return a layout that was acquired before the restart of the metadata + server during the metadata server's grace period. When returning + layouts that were acquired during the metadata server's grace period, + the client MUST set the lora_reclaim field to FALSE. The + lora_reclaim field MUST be set to FALSE also when lr_layoutreturn is + LAYOUTRETURN4_FSID or LAYOUTRETURN4_ALL. See LAYOUTCOMMIT + (Section 18.42) for more details. Layouts may be returned when recalled or voluntarily (i.e., before the server has recalled them). In either case the client must properly propagate state changed under the context of the layout to the storage device(s) or to the metadata server before returning the layout. - If the client is returning the layout in response to a - CB_LAYOUTRECALL where the lor_recalltype was LAYOUTRECALL4_FILE, the - client should use the lor_stateid value from CB_LAYOUTRECALL as the - value for lrf_stateid. Otherwise, it should use logr_stateid (from a - previous LAYOUTGET result) or lorr_stateid (from a previous LAYRETURN - result). This is done to indicate the point in time (in terms of - layout stateid transitions) when the recall was sent. The client - must use the precise lora_recallstateid value and not set the seqid - to zero. Otherwise NFS4ERR_BAD_STATEID will be returned. 
- - NFS4ERR_OLD_STATEID can be returned if the client is using an old - seqid, and the server knows the client should not be using the old - seqid. E.g. the client uses the seqid on slot 1 of the session, - received the response with the new seqid, and uses the slot to send - another request with the old seqid. + If the client returns the layout in response to a CB_LAYOUTRECALL + where the lor_recalltype field of the clora_recall field was + LAYOUTRECALL4_FILE, the client should use the lor_stateid value from + CB_LAYOUTRECALL as the value for lrf_stateid. Otherwise, it should + use logr_stateid (from a previous LAYOUTGET result) or lorr_stateid + (from a previous LAYOUTRETURN result). This is done to indicate the + point in time (in terms of layout stateid transitions) when the + recall was sent. The client uses the precise lora_recallstateid + value and MUST NOT set the stateid's seqid to zero; otherwise + NFS4ERR_BAD_STATEID MUST be returned. NFS4ERR_OLD_STATEID can be + returned if the client is using an old seqid, and the server knows + the client should not be using the old seqid. E.g., the client uses + the seqid on slot 1 of the session, receives the response with the + new seqid, and uses the slot to send another request with the old + seqid. If a client fails to return a layout in a timely manner, then the - metadata server should use its control protocol with the storage + metadata server SHOULD use its control protocol with the storage devices to fence the client from accessing the data referenced by the layout. See Section 12.5.5 for more details. If the LAYOUTRETURN request sets the lora_reclaim field to TRUE after the metadata server's grace period, NFS4ERR_NO_GRACE is returned. If the LAYOUTRETURN request sets the lora_reclaim field to TRUE and lr_returntype is set to LAYOUTRETURN4_FSID or LAYOUTRETURN4_ALL, NFS4ERR_INVAL is returned. 
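The two lora_reclaim error cases just described can be sketched in Python as follows. This is an illustration, not a server implementation: the function name check_layoutreturn_reclaim and the in_grace parameter are hypothetical, status codes are modeled as strings, and the order in which the two checks run is an assumption, since the text does not say which error wins when both conditions hold.

```python
def check_layoutreturn_reclaim(lora_reclaim: bool,
                               lr_returntype: str,
                               in_grace: bool) -> str:
    """Hypothetical helper: apply the lora_reclaim rules of LAYOUTRETURN.

    in_grace is an assumed input telling whether the metadata server
    is currently inside its grace period.
    """
    if not lora_reclaim:
        return "NFS4_OK"  # the reclaim rules apply only when the flag is TRUE
    # Reclaim is only meaningful for a per-file return.
    # (Assumption: this check is performed before the grace-period check.)
    if lr_returntype in ("LAYOUTRETURN4_FSID", "LAYOUTRETURN4_ALL"):
        return "NFS4ERR_INVAL"
    # Reclaim after the grace period has ended is refused.
    if not in_grace:
        return "NFS4ERR_NO_GRACE"
    return "NFS4_OK"
```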
- If the operation specified lr_returntype of LAYOUTRETURN4_FILE, then - lrs_stateid will represent the layout stateid as updated for this - operation's processing; the current stateid will also be updated to - match the returned value. If the last byte of any layout for the - current file, client ID, and layout type is being returned and there - are no remaining pending CB_LAYOUTRECALL operations for which a - LAYOUTRETURN operation must be done as a completing operation, - lrs_present MUST be FALSE, and thus no stateid will be returned. + If the operation set the lr_returntype field to LAYOUTRETURN4_FILE, + then the lrs_stateid field will represent the layout stateid as + updated for this operation's processing; the current stateid will + also be updated to match the returned value. If the last byte of any + layout for the current file, client ID, and layout type is being + returned and there are no remaining pending CB_LAYOUTRECALL + operations for which a LAYOUTRETURN operation must be done, + lrs_present MUST be FALSE, and thus no stateid will be returned and + the current stateid will be cleared. On success, the current filehandle retains its value. - The server MAY require that the principal, security flavor, and if - applicable, the GSS mechanism, combination that acquired the layout - also be the one to send LAYOUTRETURN. This might not be possible if - credentials for the principal are no longer available. The server - MAY allow the machine credential or SSV credential (see - Section 18.35) to send LAYOUTRETURN. + If the EXCHGID4_FLAG_BIND_PRINC_STATEID capability is set on the + client ID (see Section 18.35), the server will require that the + principal, security flavor, and if applicable, the GSS mechanism, + combination that acquired the layout also be the one to send + LAYOUTRETURN. This might not be possible if credentials for the + principal are no longer available. 
The server will allow the machine + credential or SSV credential (see Section 18.35) to send LAYOUTRETURN + if LAYOUTRETURN's operation code was set in the spo_must_allow result + of EXCHANGE_ID. 18.44.4. IMPLEMENTATION The final LAYOUTRETURN operation in response to a CB_LAYOUTRECALL callback MUST be serialized with any outstanding, intersecting LAYOUTRETURN operations. Note that it is possible that while a client is returning the layout for some recalled range the server may recall a superset of that range (e.g. LAYOUTRECALL4_ALL); the final return operation for the latter must block until the former layout - recall is done - when its corresponding final return operation is - replied. + recall is done. Returning all layouts in a file system using LAYOUTRETURN4_FSID is typically done in response to a CB_LAYOUTRECALL for that file system as the final return operation. Similarly, LAYOUTRETURN4_ALL is used in response to a recall callback for all layouts. It is possible that the client already returned some outstanding layouts via individual LAYOUTRETURN calls and the call for LAYOUTRETURN4_FSID or LAYOUTRETURN4_ALL marks the end of the LAYOUTRETURN sequence. See Section 12.5.5.1 for more details. @@ -24563,21 +24655,21 @@ 18.46.3. DESCRIPTION The SEQUENCE operation is used by the server to implement session request control and the reply cache semantics. This operation MUST appear as the first operation of any COMPOUND in which it appears. The error NFS4ERR_SEQUENCE_POS will be returned when it is found in any position in a COMPOUND beyond the first. Operations other than SEQUENCE, BIND_CONN_TO_SESSION, EXCHANGE_ID, - CREATE_SESSION, and DESTROY_SESSION, may not appear as the first + CREATE_SESSION, and DESTROY_SESSION, MUST NOT appear as the first operation in a COMPOUND. Such operations MUST yield the error NFS4ERR_OP_NOT_IN_SESSION if they do appear at the start of a COMPOUND. 
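The positional rules for SEQUENCE quoted above can be sketched as a check over the list of operation names in a COMPOUND. This is a simplified illustration, not how a server is required to structure its processing: the function name and string status codes are hypothetical, the treatment of an empty COMPOUND is an assumption, and the sketch covers only the five operations listed in this paragraph (exceptions described elsewhere in the specification are not modelled).

```python
from typing import Sequence

# Operations the paragraph allows as the first operation of a COMPOUND.
ALLOWED_FIRST_OPS = frozenset({
    "SEQUENCE",
    "BIND_CONN_TO_SESSION",
    "EXCHANGE_ID",
    "CREATE_SESSION",
    "DESTROY_SESSION",
})


def check_compound_ops(ops: Sequence[str]) -> str:
    """Return a status name for the SEQUENCE placement rules only."""
    if not ops:
        # Assumption: an empty COMPOUND violates neither rule.
        return "NFS4_OK"
    if ops[0] not in ALLOWED_FIRST_OPS:
        return "NFS4ERR_OP_NOT_IN_SESSION"
    # SEQUENCE found anywhere beyond the first position is an error.
    for position, op in enumerate(ops):
        if op == "SEQUENCE" and position > 0:
            return "NFS4ERR_SEQUENCE_POS"
    return "NFS4_OK"
```

For example, a COMPOUND of SEQUENCE followed by PUTFH passes, while a COMPOUND that starts with PUTFH is rejected with NFS4ERR_OP_NOT_IN_SESSION.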
If SEQUENCE is received on a connection not associated with the session via CREATE_SESSION or BIND_CONN_TO_SESSION, and connection association enforcement is enabled (see Section 18.35), then the server returns NFS4ERR_CONN_NOT_BOUND_TO_SESSION. The sa_sessionid argument identifies the session this request applies @@ -24716,35 +24808,36 @@ SEQ4_STATUS_DEVID_CHANGED The client is using device ID notifications and the server has changed a device ID mapping held by the client. This flag will stay present until the client has obtained the new mapping with GETDEVICEINFO. SEQ4_STATUS_DEVID_DELETED The client is using device ID notifications and the server has deleted a device ID mapping held by the client. This flag will - stay in affect until the client sends GETDEVICEINFO with a null - value in the argument gdia_notify_types. + stay in effect until the client sends a GETDEVICEINFO on the + device ID with a null value in the argument gdia_notify_types. - The value of sa_sequenceid argument relative to the cached sequence - id on the slot falls into one of three cases. + The value of the sa_sequenceid argument relative to the cached + sequence id on the slot falls into one of three cases. o If the difference between sa_sequenceid and the server's cached sequence id at the slot id is two (2) or more, or if sa_sequenceid is less than the cached sequence id (accounting for wraparound of the unsigned sequence id value), then the server MUST return NFS4ERR_SEQ_MISORDERED. o If sa_sequenceid and the cached sequence id are the same, this is a retry, and the server replies with the COMPOUND reply that is - stored in the reply cache. The lease is possibly renewed. + stored in the reply cache. The lease is possibly renewed as + described below. o If sa_sequenceid is one greater (accounting for wraparound) than the cached sequence id, then this is a new request, and the slot's sequence id is incremented. The operations subsequent to SEQUENCE, if any, are processed. 
If there are no other operations, the only other effects are to cache the SEQUENCE reply in the slot, maintain the session's activity, and possibly renew the lease. If the client reuses a slot id and sequence id for a completely @@ -24818,64 +24911,64 @@ This operation is used to update the SSV for a client ID. Before SET_SSV is called the first time on a client ID, the SSV is zero (0). The SSV is the key used for the SSV GSS mechanism (Section 2.10.8) SET_SSV MUST be preceded by a SEQUENCE operation in the same COMPOUND. It MUST NOT be used if the client did not opt for SP4_SSV state protection when the client ID was created (see Section 18.35); the server returns NFS4ERR_INVAL in that case. - ssa_digest is computed as the output of the HMAC RFC2104 [11] using - the subkey derived from the SSV4_SUBKEY_MIC_I2T and current SSV as - the key (See Section 2.10.8 for a description of subkeys), and an XDR - encoded value of data type ssa_digest_input4. The field sdi_seqargs - is equal to the arguments of the SEQUENCE operation for the COMPOUND - procedure that SET_SSV is within. + The field ssa_digest is computed as the output of the HMAC RFC2104 + [11] using the subkey derived from the SSV4_SUBKEY_MIC_I2T and + current SSV as the key (See Section 2.10.8 for a description of + subkeys), and an XDR encoded value of data type ssa_digest_input4. + The field sdi_seqargs is equal to the arguments of the SEQUENCE + operation for the COMPOUND procedure that SET_SSV is within. The argument ssa_ssv is XORed with the current SSV to produce the new SSV. The argument ssa_ssv SHOULD be generated randomly. In the response, ssr_digest is the output of the HMAC using the subkey derived from SSV4_SUBKEY_MIC_T2I and new SSV as the key, and an XDR encoded value of data type ssr_digest_input4. The field sdi_seqres is equal to the results of the SEQUENCE operation for the COMPOUND procedure that SET_SSV is within. 
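The SET_SSV arithmetic described above (XOR to derive the new SSV, HMAC over an XDR-encoded input for the digests) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the hash algorithm defaults to SHA-256 only for the sake of the example (the text here does not name the algorithm), and both the subkey derivation of Section 2.10.8 and the XDR encoding of ssa_digest_input4 or ssr_digest_input4 are taken as already-computed byte strings.

```python
import hashlib
import hmac


def next_ssv(current_ssv: bytes, ssa_ssv: bytes) -> bytes:
    """XOR ssa_ssv with the current SSV to produce the new SSV."""
    if len(current_ssv) != len(ssa_ssv):
        # Assumption: both values have the same length.
        raise ValueError("ssa_ssv and the current SSV differ in length")
    return bytes(a ^ b for a, b in zip(current_ssv, ssa_ssv))


def ssv_digest(subkey: bytes, xdr_encoded_input: bytes,
               hash_name: str = "sha256") -> bytes:
    """HMAC (RFC 2104) over the XDR-encoded digest input.

    subkey is assumed to be derived already (SSV4_SUBKEY_MIC_I2T for
    ssa_digest, SSV4_SUBKEY_MIC_T2I for ssr_digest); hash_name is an
    illustrative default, not taken from the text.
    """
    return hmac.new(subkey, xdr_encoded_input, hash_name).digest()


def digest_matches(subkey: bytes, xdr_encoded_input: bytes,
                   received_digest: bytes,
                   hash_name: str = "sha256") -> bool:
    """Recompute the digest and compare it in constant time."""
    expected = ssv_digest(subkey, xdr_encoded_input, hash_name)
    return hmac.compare_digest(expected, received_digest)
```

Because the update is an XOR, the first SET_SSV on a client ID (whose SSV starts at zero) makes the new SSV equal to ssa_ssv itself.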
As noted in Section 18.35, the client and server can maintain multiple concurrent versions of the SSV. The client and server each MUST maintain an internal SSV version number, which is set to one (1) the first time SET_SSV executes on the server and the client receives the first SET_SSV reply. Each subsequent SET_SSV increases the - internal counter by one (1). The value of this version number - corresponds to the smpt_ssv_seq, smt_ssv_seq, sspt_ssv_seq, and - ssct_ssv_seq fields for the SSV GSS mechanism tokens (see + internal SSV version number by one (1). The value of this version + number corresponds to the smpt_ssv_seq, smt_ssv_seq, sspt_ssv_seq, + and ssct_ssv_seq fields of the SSV GSS mechanism tokens (see Section 2.10.8). 18.47.4. IMPLEMENTATION When the server receives ssa_digest, it MUST verify the digest by computing the digest the same way the client did and comparing it with ssa_digest. If the server gets a different result, this is an error, NFS4ERR_BAD_SESSION_DIGEST. This error might be the result of another SET_SSV from the same client ID changing the SSV. If so, the client recovers by issuing SET_SSV again with a recomputed digest based on the subkey of the new SSV. If the transport connection is dropped after the SET_SSV request is sent, but before the SET_SSV reply is received, then there are special considerations for recovery if the client has no more connections associated with sessions associated with the client ID of the SSV. See Section 18.34.4. Clients SHOULD NOT send an ssa_ssv that is equal to a previous - ssa_ssv, nor equal to a previous SSV (including an ssa_ssv equal to - zero since the SSV is initialized to zero when the client ID is - created). + ssa_ssv, nor equal to a previous or current SSV (including an ssa_ssv + equal to zero since the SSV is initialized to zero when the client ID + is created). Clients SHOULD send SET_SSV with RPCSEC_GSS privacy. 
Servers MUST support RPCSEC_GSS with privacy for any COMPOUND that has { SEQUENCE, SET_SSV }. A client SHOULD NOT send SET_SSV with the SSV GSS mechanism's credential because the purpose of SET_SSV is to seed the SSV from non-SSV credentials. Instead SET_SSV SHOULD be sent with the credential of a user that is accessing the client ID for the first time (Section 2.10.7.3). However if the client does send SET_SSV @@ -25002,39 +25095,40 @@ default: void; }; 18.49.3. DESCRIPTION Where this description mandates the return of a specific error code for a specific condition, and where multiple conditions apply, the server MAY return any of the mandated error codes. - This operation allows a client to + The server MAY support this operation. If the server does not + support this operation, it MUST return NFS4ERR_NOTSUPP. - o get a delegation on all types of files except directories. The - server MAY support this operation. If the server does not support - this operation, it MUST return NFS4ERR_NOTSUPP. + This operation allows a client to: - o register a "want" for a delegation for the specified file object, + o Get a delegation on all types of files except directories. + + o Register a "want" for a delegation for the specified file object, and be notified via a callback when the delegation is available. The server MAY support notifications of availability via callbacks. If the server does not support registration of wants it MUST NOT return an error to indicate that, and instead MUST - return ond_why set to WND4_CONTENTION or WND4_RESOURCE and + return with ond_why set to WND4_CONTENTION or WND4_RESOURCE and ond_server_will_push_deleg or ond_server_will_signal_avail set to FALSE. When the server indicates that it will notify the client by means of a callback, it will either provide the delegation using a CB_PUSH_DELEG operation, or cancel its promise by sending a CB_WANTS_CANCELLED operation. - o cancel a want for a delegation. + o Cancel a want for a delegation. 
The client SHOULD NOT set OPEN4_SHARE_ACCESS_READ and SHOULD NOT set OPEN4_SHARE_ACCESS_WRITE in wda_want. If it does, the server MUST ignore them. The meanings of the following flags in wda_want are the same as they are in OPEN: o OPEN4_SHARE_ACCESS_WANT_READ_DELEG @@ -25042,29 +25136,28 @@ o OPEN4_SHARE_ACCESS_WANT_ANY_DELEG o OPEN4_SHARE_ACCESS_WANT_NO_DELEG o OPEN4_SHARE_ACCESS_WANT_CANCEL o OPEN4_SHARE_ACCESS_WANT_SIGNAL_DELEG_WHEN_RESRC_AVAIL o OPEN4_SHARE_ACCESS_WANT_PUSH_DELEG_WHEN_UNCONTENDED - The handling of the above flags in WANT_DELEGATION is the same as in OPEN. Information about the delegation and/or the promises the server is making regarding future callbacks are the same as those described in the open_delegation4 structure. The successful results of WANT_DELEG are of type open_delegation4 which is the same type as the "delegation" field in the results of - the OPEN operation. (See Section 18.16.3). The server constructs + the OPEN operation (see Section 18.16.3). The server constructs wdr_resok4 the same way it constructs OPEN's "delegation" with one difference: WANT_DELEGATION MUST NOT return a delegation type of OPEN_DELEGATE_NONE. If (wda_want & OPEN4_SHARE_ACCESS_WANT_DELEG_MASK) is zero then the client is indicating no desire for a delegation and the server MUST return NFS4ERR_INVAL. The client uses the OPEN4_SHARE_ACCESS_WANT_NO_DELEG flag in the WANT_DELEGATION operation to cancel a previously requested want for a @@ -25113,22 +25206,22 @@ 18.50.2. RESULT struct DESTROY_CLIENTID4res { nfsstat4 dcr_status; }; 18.50.3. DESCRIPTION The DESTROY_CLIENTID operation destroys the client ID. If there are sessions (both idle and non-idle), opens, locks, delegations, - layouts, and wants (Section 18.49) associated with the unexpired - lease of the client ID the server MUST return NFS4ERR_CLIENTID_BUSY. + layouts, and/or wants (Section 18.49) associated with the unexpired + lease of the client ID, the server MUST return NFS4ERR_CLIENTID_BUSY. 
DESTROY_CLIENTID MAY be preceded with a SEQUENCE operation as long as the client ID derived from the sessionid of SEQUENCE is not the same as the client ID to be destroyed. If the client IDs are the same, then the server MUST return NFS4ERR_CLIENTID_BUSY. If DESTROY_CLIENTID is not prefixed by SEQUENCE, it MUST be the only operation in the COMPOUND request (otherwise the server MUST return NFS4ERR_NOT_ONLY_OP). If the operation is sent without a SEQUENCE preceding it, a client that retransmits the request may receive an error in response, because the original request might have been @@ -25174,78 +25267,78 @@ RECLAIM_COMPLETE operations: o When rca_one_fs is FALSE, a global RECLAIM_COMPLETE is being done. This indicates that recovery of all locks that the client held on the previous server instance have been completed. o When rca_one_fs is TRUE, a file system-specific RECLAIM_COMPLETE is being done. This indicates that recovery of locks for a single fs (the one designated by the current filehandle) due to a file system transition have been completed. Presence of a current - filehandle is only required when rca_one_fs is true. + filehandle is only required when rca_one_fs is set to TRUE. Once a RECLAIM_COMPLETE is done, there can be no further reclaim operations for locks whose scope is defined as having completed recovery. Once the client sends RECLAIM_COMPLETE, the server will not allow the client to do subsequent reclaims of locking state for - that scope and will return NFS4ERR_NO_GRACE, if these are attempted. + that scope and if these are attempted, will return NFS4ERR_NO_GRACE. Whenever a client establishes a new client ID and before it does the first non-reclaim operation that obtains a lock, it MUST do a global RECLAIM_COMPLETE, even if there are no locks to reclaim. If non- reclaim locking operations are done before the RECLAIM_COMPLETE, a - NFS4ERR_GRACE will be returned. + NFS4ERR_GRACE error will be returned. 
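The global RECLAIM_COMPLETE rules above can be sketched as per-client state on the server. This is an illustrative model only: the class, its method names, and the string status codes are hypothetical, it covers only the global (rca_one_fs FALSE) case, and it ignores the server's own grace period, so it does not model every situation in which NFS4ERR_GRACE can be returned.

```python
class ClientReclaimState:
    """Tracks whether a client ID has done a global RECLAIM_COMPLETE."""

    def __init__(self) -> None:
        self.reclaim_complete = False

    def handle_reclaim_complete(self) -> str:
        # After this point no further reclaims are allowed in this scope.
        self.reclaim_complete = True
        return "NFS4_OK"

    def handle_lock_request(self, is_reclaim: bool) -> str:
        """Return a status name for a lock-obtaining operation."""
        if is_reclaim:
            # Reclaims are refused once RECLAIM_COMPLETE has been done.
            return "NFS4ERR_NO_GRACE" if self.reclaim_complete else "NFS4_OK"
        # Non-reclaim locking before RECLAIM_COMPLETE is refused.
        return "NFS4_OK" if self.reclaim_complete else "NFS4ERR_GRACE"
```

A client with nothing to reclaim still has to call the equivalent of handle_reclaim_complete before its first non-reclaim lock request succeeds in this model.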
Similarly, when the client accesses a file system on a new server, before it sends the first non-reclaim operation that obtains a lock - on this new server, it must do a RECLAIM_COMPLETE with rca_one_fs - true and current filehandle within that file system, even if there + on this new server, it must do a RECLAIM_COMPLETE with rca_one_fs set + to TRUE and current filehandle within that file system, even if there are no locks to reclaim. If non-reclaim locking operations are done on that file system before the RECLAIM_COMPLETE, a NFS4ERR_GRACE will be returned. Any locks not reclaimed at the point at which RECLAIM_COMPLETE is done become non-reclaimable. The client MUST NOT attempt to reclaim them, either during the current server instance or in any subsequent server instance, or on another server to which responsibility for that file system is transferred. If the client were to do so, it would be violating the protocol by representing itself as owning locks that it does not own, and so has no right to reclaim. See Section 8.4.3 for a discussion of edge conditions related to lock reclaim. - Once the client has done a RECLAIM_COMPLETE, it indicates readiness - to proceed to do normal non-reclaim locking operations. The client + By sending a RECLAIM_COMPLETE, the client indicates readiness to + proceed to do normal non-reclaim locking operations. The client should be aware that such operations may temporarily result in NFS4ERR_GRACE errors until the server is ready to terminate its grace period. 18.51.4. IMPLEMENTATION Servers will typically use the information as to when reclaim activity is complete to reduce the length of the grace period. When - the server maintains a list of clients that may have locks in - persistent storage, it is in a position to use the fact that all such - clients have done a RECLAIM_COMPLETE to terminate the grace period - and begin normal operations (i.e. grant requests for new locks) - sooner than it might otherwise. 
+ the server maintains in persistent storage a list of clients that + might have had locks, it is in a position to use the fact that all + such clients have done a RECLAIM_COMPLETE to terminate the grace + period and begin normal operations (i.e. grant requests for new + locks) sooner than it might otherwise. Latency can be minimized by doing a RECLAIM_COMPLETE as part of the COMPOUND request in which the last lock-reclaiming operation is done. When there are no reclaims to be done, RECLAIM_COMPLETE should be done immediately in order to allow the grace period to end as soon as possible. RECLAIM_COMPLETE should only be done once for each server instance, or occasion of the transition of a file system. If it is done a - second time, an NFS4ERR_COMPLETE_ALREADY will result. Note that - because of the session feature's retry protection, retries of + second time, the error NFS4ERR_COMPLETE_ALREADY will result. Note + that because of the session feature's retry protection, retries of COMPOUND requests containing a RECLAIM_COMPLETE operation will not result in this error. When a RECLAIM_COMPLETE is done, the client effectively acknowledges any locks not yet reclaimed as lost. This allows the server to again mark this client as able to subsequently recover locks if it had been prevented from doing so by logic to prevent the occurrence of edge conditions, as described in Section 8.4.3. 18.52. Operation 10044: ILLEGAL - Illegal operation @@ -26472,22 +26565,22 @@ * CB_ILLEGAL: Response for illegal operation numbers */ struct CB_ILLEGAL4res { nfsstat4 status; }; 20.13.3. DESCRIPTION This operation is a placeholder for encoding a result to handle the case of the client sending an operation code within COMPOUND that is - not supported. See the COMPOUND procedure description for more - details. + not defined in the NFSv4.1 specification. See Section 16.2.3 for + more details. The status field of CB_ILLEGAL4res MUST be set to NFS4ERR_OP_ILLEGAL. 20.13.4. 
IMPLEMENTATION A server will probably not send an operation with code OP_CB_ILLEGAL but if it does, the response will be CB_ILLEGAL4res just as it would be with any other invalid operation code. Note that if the client gets an illegal operation code that is not OP_ILLEGAL, and if the client checks for legal operation codes during the XDR decode phase,