idnits 2.17.00 (12 Aug 2021) /tmp/idnits11445/draft-ietf-nfsv4-rfc5667bis-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to contain a disclaimer for pre-RFC5378 work, but was first submitted on or after 10 November 2008. The disclaimer is usually necessary only for documents that revise or obsolete older RFCs, and that take significant amounts of text from those RFCs. If you can contact all authors of the source material and they are willing to grant the BCP78 rights to the IETF Trust, you can and should remove the disclaimer. Otherwise, the disclaimer is needed and you can ignore this comment. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (February 3, 2017) is 1932 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Outdated reference: draft-ietf-nfsv4-rfc5666bis has been published as RFC 8166 == Outdated reference: draft-ietf-nfsv4-rpcrdma-bidirection has been published as RFC 8167 ** Obsolete normative reference: RFC 5661 (Obsoleted by RFC 8881) == Outdated reference: draft-ietf-nfsv4-versioning has been published as RFC 8178 -- Obsolete informational reference (is this intentional?): RFC 5667 (Obsoleted by RFC 8267) Summary: 1 error (**), 0 flaws (~~), 5 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network File System Version 4 C. Lever, Ed. 3 Internet-Draft Oracle 4 Obsoletes: 5667 (if approved) February 3, 2017 5 Intended status: Standards Track 6 Expires: August 7, 2017 8 Network File System (NFS) Upper Layer Binding To RPC-Over-RDMA 9 draft-ietf-nfsv4-rfc5667bis-05 11 Abstract 13 This document specifies Upper Layer Bindings of Network File System 14 (NFS) protocol versions to RPC-over-RDMA. Upper Layer Bindings are 15 required to enable RPC-based protocols, such as NFS, to use Direct 16 Data Placement on RPC-over-RDMA. This document obsoletes RFC 5667. 18 Requirements Language 20 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 21 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 22 document are to be interpreted as described in [RFC2119]. 24 Status of This Memo 26 This Internet-Draft is submitted in full conformance with the 27 provisions of BCP 78 and BCP 79. 29 Internet-Drafts are working documents of the Internet Engineering 30 Task Force (IETF). 
Note that other groups may also distribute 31 working documents as Internet-Drafts. The list of current Internet- 32 Drafts is at http://datatracker.ietf.org/drafts/current/. 34 Internet-Drafts are draft documents valid for a maximum of six months 35 and may be updated, replaced, or obsoleted by other documents at any 36 time. It is inappropriate to use Internet-Drafts as reference 37 material or to cite them other than as "work in progress." 39 This Internet-Draft will expire on August 7, 2017. 41 Copyright Notice 43 Copyright (c) 2017 IETF Trust and the persons identified as the 44 document authors. All rights reserved. 46 This document is subject to BCP 78 and the IETF Trust's Legal 47 Provisions Relating to IETF Documents 48 (http://trustee.ietf.org/license-info) in effect on the date of 49 publication of this document. Please review these documents 50 carefully, as they describe your rights and restrictions with respect 51 to this document. Code Components extracted from this document must 52 include Simplified BSD License text as described in Section 4.e of 53 the Trust Legal Provisions and are provided without warranty as 54 described in the Simplified BSD License. 56 This document may contain material from IETF Documents or IETF 57 Contributions published or made publicly available before November 58 10, 2008. The person(s) controlling the copyright in some of this 59 material may not have granted the IETF Trust the right to allow 60 modifications of such material outside the IETF Standards Process. 61 Without obtaining an adequate license from the person(s) controlling 62 the copyright in such materials, this document may not be modified 63 outside the IETF Standards Process, and derivative works of it may 64 not be created outside the IETF Standards Process, except to format 65 it for publication as an RFC or to translate it into languages other 66 than English. 68 Table of Contents 70 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 
2 71 2. Conveying NFS Operations On RPC-Over-RDMA . . . . . . . . . . 3 72 3. Upper Layer Binding For NFS Versions 2 And 3 . . . . . . . . 4 73 4. Upper Layer Binding For NFS Version 4 . . . . . . . . . . . . 6 74 5. Extending NFS Upper Layer Bindings . . . . . . . . . . . . . 12 75 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 76 7. Security Considerations . . . . . . . . . . . . . . . . . . . 13 77 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 14 78 Appendix A. Changes Since RFC 5667 . . . . . . . . . . . . . . . 15 79 Appendix B. Acknowledgments . . . . . . . . . . . . . . . . . . 17 80 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 17 82 1. Introduction 84 An RPC-over-RDMA transport, such as the one defined in 85 [I-D.ietf-nfsv4-rfc5666bis], may employ direct data placement to 86 convey data payloads associated with RPC transactions. To enable 87 successful interoperation, RPC client and server implementations must 88 agree as to which XDR data items in what particular RPC procedures 89 are eligible for direct data placement (DDP). 91 This document contains material required of Upper Layer Bindings, as 92 specified in [I-D.ietf-nfsv4-rfc5666bis], for the following NFS 93 protocol versions: 95 o NFS Version 2 [RFC1094] 96 o NFS Version 3 [RFC1813] 98 o NFS Version 4.0 [RFC7530] 100 o NFS Version 4.1 [RFC5661] 102 o NFS Version 4.2 [RFC7862] 104 Upper Layer Bindings specified in this document apply to all versions 105 of RPC-over-RDMA. 107 2. Conveying NFS Operations On RPC-Over-RDMA 109 Definitions of terminology and a general discussion of how RPC-over- 110 RDMA is used to convey RPC transactions can be found in 111 [I-D.ietf-nfsv4-rfc5666bis]. In this section, these general 112 principles are applied in the context of conveying NFS procedures on 113 RPC-over-RDMA. Some issues common to all NFS protocol versions are 114 introduced. 116 2.1. 
DDP Eligibility Violations 118 To report a DDP-eligibility violation, an NFS server MUST return one 119 of: 121 o An RPC-over-RDMA message of type RDMA_ERROR, with the rdma_xid 122 field set to the XID of the matching NFS Call, and the rdma_error 123 field set to ERR_CHUNK; or 125 o An RPC message (via an RDMA_MSG message) with the xid field set to 126 the XID of the matching NFS Call, the mtype field set to REPLY, 127 the stat field set to MSG_ACCEPTED, and the accept_stat field set 128 to GARBAGE_ARGS. 130 Subsequent sections of this document describe further considerations 131 particular to specific NFS protocols or procedures. 133 2.2. Reply Size Estimation 135 During the construction of each RPC Call message, an NFS client is 136 responsible for allocating appropriate resources for receiving the 137 matching Reply message. A Reply buffer overrun can result in 138 corruption of the Reply message or termination of the transport 139 connection. Therefore, reliable reply size estimation is necessary to 140 ensure successful interoperation. This is particularly critical, for 141 example, when allocating a Reply chunk. 143 In many cases the Upper Layer Protocol's XDR definition provides 144 enough information to enable the client to make a reliable prediction 145 of the maximum size of the expected Reply message. If there are 146 variable-size data items in the result, the maximum size of the RPC 147 Reply message can be reliably estimated in most cases: 149 o The client requests only a specific portion of an object (for 150 example, using the "count" and "offset" fields in an NFS READ). 152 o The client has already cached the size of the whole object it is 153 about to request (say, via a previous NFS GETATTR request). 155 o The client and server have negotiated a maximum size for all calls 156 and responses. 
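The three estimation cases listed above can be illustrated with a minimal sketch. It is not part of the specification: the function names, the fixed_overhead parameter, and the inline_threshold parameter are hypothetical, and the sketch assumes the client already knows the fixed-size portion of the Reply from the XDR definition.

```python
from typing import Optional

XDR_UNIT = 4  # XDR encodes every item in multiples of four bytes


def xdr_pad(length: int) -> int:
    """Round a byte count up to the next XDR four-byte boundary."""
    return (length + XDR_UNIT - 1) // XDR_UNIT * XDR_UNIT


def max_reply_size(fixed_overhead: int, variable_limit: int,
                   negotiated_max: Optional[int] = None) -> int:
    """Hypothetical upper bound on the size of an RPC Reply message.

    fixed_overhead -- bytes of RPC header and fixed-size results (assumed
                      to be derived from the XDR definition)
    variable_limit -- largest variable-size item the server may return,
                      such as the "count" of a READ or a cached object size
    negotiated_max -- optional maximum negotiated by client and server
    """
    # One XDR length word precedes the padded variable-size data item.
    estimate = fixed_overhead + XDR_UNIT + xdr_pad(variable_limit)
    if negotiated_max is not None:
        estimate = min(estimate, negotiated_max)
    return estimate


def needs_reply_chunk(estimate: int, inline_threshold: int) -> bool:
    """Assumed policy: use a Reply chunk when the estimate cannot fit inline."""
    return estimate > inline_threshold
```

A client following this sketch would compute the estimate while constructing the Call and allocate a Reply chunk only when needs_reply_chunk returns True.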
158 Subsequent sections of this document describe considerations 159 particular to specific NFS procedures where it is not possible to 160 determine the maximum Reply message size based solely on the above 161 criteria. 163 3. Upper Layer Binding For NFS Versions 2 And 3 165 This Upper Layer Binding specification applies to NFS Version 2 166 [RFC1094] and NFS Version 3 [RFC1813]. For brevity, in this section 167 a "legacy NFS client" refers to an NFS client using NFS version 2 or 168 NFS version 3 to communicate with an NFS server. Likewise, a "legacy 169 NFS server" is an NFS server communicating with clients using NFS 170 version 2 or NFS version 3. 172 The following XDR data items in NFS versions 2 and 3 are DDP- 173 eligible: 175 o The opaque file data argument in the NFS WRITE procedure 177 o The pathname argument in the NFS SYMLINK procedure 179 o The opaque file data result in the NFS READ procedure 181 o The pathname result in the NFS READLINK procedure 183 All other argument or result data items in NFS versions 2 and 3 are 184 not DDP-eligible. 186 A legacy server's response to a DDP-eligibility violation (described 187 in Section 2.1) does not give an indication to legacy clients of 188 whether the server has processed the arguments of the RPC Call, or 189 whether the server has accessed or modified client memory associated 190 with that RPC. 192 A legacy NFS client determines the maximum reply size for each 193 operation using the basic criteria outlined in Section 2.2. 195 3.1. Auxiliary Protocols 197 NFS versions 2 and 3 are typically deployed with several other 198 protocols, sometimes referred to as "NFS auxiliary protocols." These 199 are separate RPC programs that define procedures which are not part 200 of the NFS version 2 or version 3 RPC programs. 
These include: 202 o The MOUNT and NLM protocols, introduced in an appendix of 203 [RFC1813] 205 o The NSM protocol, described in Chapter 11 of [NSM] 207 o The NFSACL protocol, which does not have a public definition 208 (NFSACL here is treated as a de facto standard as there are 209 several interoperating implementations). 211 RPC-over-RDMA considers these programs as distinct Upper Layer 212 Protocols [I-D.ietf-nfsv4-rfc5666bis]. To enable the use of these 213 ULPs on an RPC-over-RDMA transport, an Upper Layer Binding 214 specification is provided here for each. 216 3.1.1. MOUNT, NLM, And NSM Protocols 218 Typically, MOUNT, NLM, and NSM are conveyed via TCP, even in 219 deployments where NFS operations are conveyed on RPC-over-RDMA. When a legacy 220 server supports these programs on RPC-over-RDMA, it advertises the 221 port address via the usual rpcbind service [RFC1833]. 223 No operation in these protocols conveys a significant data payload, 224 and the size of RPC messages in these protocols is uniformly small. 225 Therefore, no XDR data items in these protocols are DDP-eligible. 226 The largest variable-length XDR data item is an xdr_netobj. In most 227 implementations this data item is not larger than 1024 bytes, making 228 reliable reply size estimation straightforward using the criteria 229 outlined in Section 2.2. 231 3.1.2. NFSACL Protocol 233 Legacy clients and servers that support the NFSACL RPC program 234 typically convey NFSACL procedures on the same connection as the NFS 235 RPC program. This obviates the need for separate rpcbind queries to 236 discover server support for this RPC program. 238 ACLs are typically small, but even large ACLs must be encoded and 239 decoded to some degree. Thus, no data item in this Upper Layer 240 Protocol is DDP-eligible. 242 For procedures whose replies do not include an ACL object, the size 243 of a reply is determined directly from the NFSACL program's XDR 244 definition. 
246 There is no protocol-wide size limit for NFS version 3 ACLs, and 247 there is no mechanism in either the NFSACL or NFS programs for a 248 legacy client to ascertain the largest ACL a legacy server can store. 249 Legacy client implementations should choose a maximum size for ACLs 250 based on their own internal limits. A recommended lower bound for 251 this maximum is 32,768 bytes. 253 When an especially large ACL is expected, a Reply chunk might be 254 required. If a legacy NFS server indicates that it cannot return an 255 NFSACL GETACL response because the legacy NFS client has not provided 256 a large enough Reply chunk to receive that response, the legacy NFS 257 client can choose to 259 o Terminate the NFSACL GETACL with an error, or 261 o Allocate a larger Reply chunk and send the same NFSACL GETACL 262 request as a new RPC transaction. The NFS client should avoid 263 retrying the request indefinitely. 265 4. Upper Layer Binding For NFS Version 4 267 This Upper Layer Binding specification applies to all protocols 268 defined in NFS Version 4.0 [RFC7530], NFS Version 4.1 [RFC5661], and 269 NFS Version 4.2 [RFC7862]. 271 4.1. DDP-Eligibility 273 Only the following XDR data items in the COMPOUND procedure of all 274 NFS version 4 minor versions are DDP-eligible: 276 o The opaque data field in the WRITE4args structure 278 o The linkdata field of the NF4LNK arm in the createtype4 union 280 o The opaque data field in the READ4resok structure 282 o The linkdata field in the READLINK4resok structure 283 o In minor version 2 and newer, the rpc_data field of the 284 read_plus_content union (further restrictions on the use of this 285 data item follow below). 287 4.1.1. READ_PLUS Replies 289 The NFS version 4.2 READ_PLUS operation returns a complex data type 290 [RFC7862]. The rpr_contents field in the result of this operation is 291 an array of read_plus_content unions, one arm of which contains an 292 opaque byte stream (d_data). 
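As an illustration of the DDP-eligibility list in Section 4.1, the following minimal sketch encodes that list as a lookup table. The table layout, the function name, and the use of operation names as keys are hypothetical choices made for this example; only the data items themselves come from the list above.

```python
# Hypothetical table built from the list in Section 4.1; the dictionary
# layout and function name are illustrative assumptions, not protocol elements.
DDP_ELIGIBLE_V4 = {
    # (operation, direction): (XDR data item, first minor version it applies to)
    ("WRITE", "argument"): ("opaque data field in WRITE4args", 0),
    ("CREATE", "argument"): ("linkdata field of the NF4LNK arm in createtype4", 0),
    ("READ", "result"): ("opaque data field in READ4resok", 0),
    ("READLINK", "result"): ("linkdata field in READLINK4resok", 0),
    ("READ_PLUS", "result"): ("rpc_data field of read_plus_content", 2),
}


def is_ddp_eligible(operation: str, direction: str, minor_version: int) -> bool:
    """Return True if the operation carries a DDP-eligible item in that direction."""
    entry = DDP_ELIGIBLE_V4.get((operation, direction))
    if entry is None:
        # Anything not listed is not DDP-eligible.
        return False
    _item, first_minor = entry
    return minor_version >= first_minor
```

For example, is_ddp_eligible("READ_PLUS", "result", 1) is False because the READ_PLUS entry applies only to minor version 2 and newer.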
294 The size of d_data is limited to the value of the rpa_count field, 295 but the protocol does not bound the number of elements which can be 296 returned in the rpr_contents array. In order to make the size of 297 READ_PLUS replies predictable by NFS version 4.2 clients, the 298 following restrictions are placed on the use of the READ_PLUS 299 operation on RPC-over-RDMA transports: 301 o An NFS version 4.2 client MUST NOT provide more than one Write 302 chunk for any READ_PLUS operation. When providing a Write chunk 303 for a READ_PLUS operation, an NFS version 4.2 client MUST provide 304 a Write chunk that is either empty (which forces all result data 305 items for this operation to be returned inline) or large enough to 306 receive rpa_count bytes in a single element of the rpr_contents 307 array. 309 o If the Write chunk provided for a READ_PLUS operation by an NFS 310 version 4.2 client is not empty, an NFS version 4.2 server MUST 311 use that chunk for the first element of the rpr_contents array 312 that has an rpc_data arm. 314 o An NFS version 4.2 server MUST NOT return more than two elements 315 in the rpr_contents array of any READ_PLUS operation. It returns 316 as much of the requested byte range as it can fit within these two 317 elements. If the NFS version 4.2 server has not asserted rpr_eof 318 in the reply, the NFS version 4.2 client SHOULD send additional 319 READ_PLUS requests for any remaining bytes. 321 4.2. NFS Version 4 Reply Size Estimation 323 Within NFS version 4, there are certain variable-length result data 324 items whose maximum size cannot be estimated by clients reliably 325 because there is no protocol-specified size limit on these arrays. 
326 These include: 328 o The attrlist4 field 329 o Fields containing ACLs such as fattr4_acl, fattr4_dacl, 330 fattr4_sacl 332 o Fields in the fs_locations4 and fs_locations_info4 data structures 334 o Fields opaque to the NFS version 4 protocol which pertain to pNFS 335 layout metadata, such as loc_body, loh_body, da_addr_body, 336 lou_body, lrf_body, fattr_layout_types, and fs_layout_types. 338 4.2.1. Reply Size Estimation For Minor Version 0 340 The NFSv4.0 protocol itself does not impose any bound on the size of 341 NFS calls or responses. 343 Some of the data items enumerated in Section 4.2 (in particular, the 344 items related to ACLs and fs_locations) make it difficult to predict 345 the maximum size of NFSv4.0 GETATTR replies that interrogate 346 variable-length attributes. As discussed in Section 2.2, client 347 implementations can rely on their own internal architectural limits 348 to bound the reply size, but such limits are not always guaranteed to 349 be reliable. 351 When an especially large NFSv4.0 GETATTR result is expected, a Reply 352 chunk might be required. If an NFSv4.0 server indicates that it 353 cannot return an NFSv4.0 GETATTR response because the requesting 354 NFSv4.0 client has not provided a large enough Reply chunk to receive 355 that response, the NFSv4.0 client can choose to 357 o Terminate the NFSv4.0 GETATTR with an error, or 359 o Allocate a larger Reply chunk and send the same NFSv4.0 GETATTR 360 request as a new RPC transaction. The NFS client should avoid 361 retrying the request indefinitely. 363 The use of NFS COMPOUND operations raises the possibility of requests 364 that combine a non-idempotent operation (e.g., NFS WRITE) with an 365 NFSv4.0 GETATTR that requests one or more variable-length results. 366 This combination should be avoided by ensuring that any NFSv4.0 367 GETATTR operation that might return a result of unpredictable length 368 is sent in an NFS COMPOUND by itself. 370 4.2.2. 
Reply Size Estimation For Minor Version 1 And Newer 372 In NFS version 4.1 and newer minor versions, the csa_fore_chan_attrs 373 argument of the CREATE_SESSION operation contains a 374 ca_maxresponsesize field. The value in this field can be taken as 375 the absolute maximum size of replies generated by a replying NFS 376 version 4 server. 378 This value can be used in cases where it is not possible to estimate 379 a reply size upper bound precisely. In practice, objects such as 380 ACLs, named attributes, layout bodies, and security labels are much 381 smaller than this maximum. 383 4.3. NFS Version 4 COMPOUND Requests 385 The NFS version 4 COMPOUND procedure allows the transmission of more 386 than one DDP-eligible data item per Call and Reply message. An NFS 387 version 4 client provides XDR Position values in each Read chunk to 388 disambiguate which chunk is associated with which argument data item. 389 However, NFS version 4 server and client implementations must agree in 390 advance on how to pair Write chunks with returned result data items. 392 The mechanism specified in Section 4.3.2 of 393 [I-D.ietf-nfsv4-rfc5666bis] is applied here, with additional 394 restrictions that appear below. In the following list, an "NFS Read" 395 operation refers to any NFS Version 4 operation which has a DDP- 396 eligible result data item (i.e., either a READ, READ_PLUS, or 397 READLINK operation). 399 o If an NFS version 4 client wishes all DDP-eligible items in an NFS 400 reply to be conveyed inline, it leaves the Write list empty. 402 o The first chunk in the Write list MUST be used by the first NFS Read 403 operation in an NFS version 4 COMPOUND procedure. The next Write 404 chunk is used by the next NFS Read operation, and so on. 406 o If an NFS version 4 client has provided a matching non-empty Write 407 chunk, then the corresponding NFS Read operation MUST return its DDP- 408 eligible data item using that chunk. 
410 o If an NFS version 4 client has provided an empty matching Write 411 chunk, then the corresponding NFS Read operation MUST return all of 412 its result data items inline. 414 o If an NFS Read operation returns a union arm which does not contain a 415 DDP-eligible result, and the NFS version 4 client has provided a 416 matching non-empty Write chunk, an NFS version 4 server MUST 417 return an empty Write chunk in that Write list position. 419 o If there are more NFS Read operations than Write chunks, then 420 remaining NFS Read operations in an NFS version 4 COMPOUND that 421 have no matching Write chunk MUST return their results inline. 423 4.3.1. NFS Version 4 COMPOUND Example 425 The following example shows a Write list with three Write chunks, A, 426 B, and C. The NFS version 4 server consumes the provided Write 427 chunks by writing the results of the designated operations in the 428 compound request (READ and READLINK) back to each chunk. 430 Write list: 432 A --> B --> C 434 NFS version 4 COMPOUND request: 436 PUTFH LOOKUP READ PUTFH LOOKUP READLINK PUTFH LOOKUP READ 437 | | | 438 v v v 439 A B C 441 If the NFS version 4 client does not want to have the READLINK result 442 returned via RDMA, it provides an empty Write chunk for buffer B to 443 indicate that the READLINK result must be returned inline. 445 4.4. NFS Version 4 Callback 447 The NFS version 4 protocols support server-initiated callbacks to 448 notify clients of events such as recalled delegations. 450 4.4.1. NFS Version 4.0 Callback 452 NFS version 4.0 implementations typically employ a separate TCP 453 connection to handle callback operations, even when the forward 454 channel uses an RPC-over-RDMA transport. 456 No operation in the NFS version 4.0 callback RPC program conveys a 457 significant data payload. Therefore, no XDR data items in this RPC 458 program are DDP-eligible. 460 A CB_RECALL reply is small and fixed in size. The CB_GETATTR reply 461 contains a variable-length fattr4 data item. 
See Section 4.2.1 for a 462 discussion of reply size prediction for this data item. 464 An NFS version 4.0 client advertises netids and ad hoc port addresses 465 for contacting its NFS version 4.0 callback service using the 466 SETCLIENTID operation. 468 4.4.2. NFS Version 4.1 Callback 470 In NFS version 4.1 and newer minor versions, callback operations may 471 appear on the same connection as is used for NFS version 4 forward 472 channel client requests. NFS version 4 clients and servers MUST use 473 the mechanism described in [I-D.ietf-nfsv4-rpcrdma-bidirection] when 474 backchannel operations are conveyed on RPC-over-RDMA transports. 476 The csa_back_chan_attrs argument of the CREATE_SESSION operation 477 contains a ca_maxresponsesize field. The value in this field can be 478 taken as the absolute maximum size of backchannel replies generated 479 by a replying NFS version 4 client. 481 There are no DDP-eligible data items in callback procedures defined 482 in NFS version 4.1 or NFS version 4.2. However, some callback 483 operations, such as messages that convey device ID information, can 484 be large, in which case a Long Call or Reply might be required. 486 When an NFS version 4.1 client can support Long Calls in its 487 backchannel, it reports a backchannel ca_maxrequestsize that is 488 larger than the connection's inline thresholds. Otherwise, an NFS 489 version 4 server MUST use only Short messages to convey backchannel 490 operations. 492 4.5. Session-Related Considerations 494 Typically, the presence of an NFS session [RFC5661] has no effect on 495 the operation of RPC-over-RDMA. None of the operations introduced to 496 support NFS sessions (e.g., SEQUENCE) contain DDP-eligible data items. 497 There is no need to match the number of session slots with the number 498 of available RPC-over-RDMA credits. 500 When an NFS session operates on an RPC-over-RDMA transport, there are 501 a few additional cases where an RPC transaction can fail. 
For 502 example, a requester might receive, in response to an RPC request, an 503 RDMA_ERROR message with an rdma_err value of ERR_CHUNK, or an 504 RDMA_MSG containing an RPC_GARBAGEARGS reply. These situations are 505 no different from existing RPC errors which an NFS session 506 implementation is already prepared to handle for other transports. 508 As with other transports during such a failure, there might be no 509 SEQUENCE result available to the requester to distinguish whether 510 failure occurred before or after the requested operations were 511 executed on the responder. When a transport error occurs (e.g., 512 RDMA_ERROR), the requester proceeds as usual to match the incoming 513 XID value to a waiting RPC Call. The RPC transaction is terminated, 514 and the result status is reported to the Upper Layer Protocol. The 515 requester's session implementation then determines the session ID and 516 slot for the failed request, and performs slot recovery to make that 517 slot usable again. If this is not done, that slot could be rendered 518 permanently unavailable. 520 4.6. Retransmission And Keep-Alive 522 NFS version 4 client implementations often rely on a transport-layer 523 keep-alive mechanism to detect when an NFS version 4 server has 524 become unresponsive. When an NFS server is no longer responsive, 525 client-side keep-alive terminates the connection, which in turn 526 triggers reconnection and RPC retransmission. 528 Some RDMA transports (such as Reliable Connections on InfiniBand) 529 have no keep-alive mechanism. Without a disconnect or new RPC 530 traffic, such connections can remain alive long after an NFS server 531 has become unresponsive. Once an NFS client has consumed all 532 available RPC-over-RDMA credits on that transport connection, it will 533 forever await a reply before sending another RPC request. 535 NFS version 4 clients SHOULD reserve one RPC-over-RDMA credit to use 536 for periodic server or connection health assessment. 
This credit can 537 be used to drive an RPC request on an otherwise idle connection, 538 triggering either a quick affirmative server response or immediate 539 connection termination. 541 In addition to network partition and request loss scenarios, RPC- 542 over-RDMA connections can be terminated when a Transport header is 543 malformed, when messages are larger than receive resources, or when too 544 many RPC-over-RDMA messages are sent at once. In such cases: 546 o If there is a transport error indicated (i.e., RDMA_ERROR) before 547 the disconnect or instead of a disconnect, the requester MUST 548 respond to that error as prescribed by the specification of the 549 RPC transport. Then the NFS version 4 rules for handling 550 retransmission apply. 552 o If there is a transport disconnect and the responder has provided 553 no other response for a request, then only the NFS version 4 rules 554 for handling retransmission apply. 556 5. Extending NFS Upper Layer Bindings 558 RPC programs such as NFS are required to have an Upper Layer Binding 559 specification to interoperate on RPC-over-RDMA transports 560 [I-D.ietf-nfsv4-rfc5666bis]. Via standards action, the Upper Layer 561 Binding specified in this document can be extended to cover versions 562 of the NFS version 4 protocol specified after NFS version 4 minor 563 version 2, or separately published extensions to an existing NFS 564 version 4 minor version, as described in [I-D.ietf-nfsv4-versioning]. 566 6. IANA Considerations 568 NFS use of direct data placement introduces a need for an additional 569 NFS port number assignment for networks that share traditional UDP 570 and TCP port spaces with RDMA services. The iWARP [RFC5041] 571 [RFC5040] protocol is such an example (InfiniBand is not). 573 NFS servers for versions 2 and 3 [RFC1094] [RFC1813] traditionally 574 listen for clients on UDP and TCP port 2049, and additionally, they 575 register these with the portmapper and/or rpcbind [RFC1833] service. 
576 However, [RFC7530] requires NFS version 4 servers to listen on TCP 577 port 2049, and they are not required to register. 579 An NFS version 2 or version 3 server supporting RPC-over-RDMA on such 580 a network and registering itself with the RPC portmapper MAY choose 581 an arbitrary port, or MAY use the alternative well-known port number 582 for its RPC-over-RDMA service. The chosen port MAY be registered 583 with the RPC portmapper under the netid assigned by the requirement 584 in [I-D.ietf-nfsv4-rfc5666bis]. 586 An NFS version 4 server supporting RPC-over-RDMA on such a network 587 MUST use the alternative well-known port number for its RPC-over-RDMA 588 service. Clients SHOULD connect to this well-known port without 589 consulting the RPC portmapper (as for NFS version 4 on TCP 590 transports). 592 The port number assigned to an NFS service over an RPC-over-RDMA 593 transport is available from the IANA port registry [RFC3232]. 595 7. Security Considerations 597 RPC-over-RDMA supports all RPC security models, including RPCSEC_GSS 598 security and transport-level security [RFC2203]. The choice of which 599 Direct Data Placement mechanism is used to convey RPC arguments and results 600 does not affect this, since it changes only the method of data 601 transfer. Specifically, the requirements of 602 [I-D.ietf-nfsv4-rfc5666bis] ensure that this choice does not 603 introduce new vulnerabilities. 605 Because this document defines only the binding of the NFS protocols 606 atop [I-D.ietf-nfsv4-rfc5666bis], all relevant security 607 considerations are described at that layer. 609 8. References 611 8.1. Normative References 613 [I-D.ietf-nfsv4-rfc5666bis] 614 Lever, C., Simpson, W., and T. Talpey, "Remote Direct 615 Memory Access Transport for Remote Procedure Call, Version 616 One", draft-ietf-nfsv4-rfc5666bis-09 (work in progress), 617 January 2017. 
619 [I-D.ietf-nfsv4-rpcrdma-bidirection] 620 Lever, C., "Bi-directional Remote Procedure Call On RPC- 621 over-RDMA Transports", draft-ietf-nfsv4-rpcrdma- 622 bidirection-06 (work in progress), January 2017. 624 [RFC1833] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", 625 RFC 1833, DOI 10.17487/RFC1833, August 1995, 626 . 628 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 629 Requirement Levels", BCP 14, RFC 2119, 630 DOI 10.17487/RFC2119, March 1997, 631 . 633 [RFC2203] Eisler, M., Chiu, A., and L. Ling, "RPCSEC_GSS Protocol 634 Specification", RFC 2203, DOI 10.17487/RFC2203, September 635 1997, . 637 [RFC5661] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., 638 "Network File System (NFS) Version 4 Minor Version 1 639 Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010, 640 . 642 [RFC7530] Haynes, T., Ed. and D. Noveck, Ed., "Network File System 643 (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530, 644 March 2015, . 646 [RFC7862] Haynes, T., "Network File System (NFS) Version 4 Minor 647 Version 2 Protocol", RFC 7862, DOI 10.17487/RFC7862, 648 November 2016, . 650 8.2. Informative References 652 [I-D.ietf-nfsv4-versioning] 653 Noveck, D., "Rules for NFSv4 Extensions and Minor 654 Versions", draft-ietf-nfsv4-versioning-09 (work in 655 progress), December 2016. 657 [NSM] The Open Group, "Protocols for Interworking: XNFS, Version 658 3W", February 1998. 660 [RFC1094] Nowicki, B., "NFS: Network File System Protocol 661 specification", RFC 1094, DOI 10.17487/RFC1094, March 662 1989, . 664 [RFC1813] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS 665 Version 3 Protocol Specification", RFC 1813, 666 DOI 10.17487/RFC1813, June 1995, 667 . 669 [RFC3232] Reynolds, J., Ed., "Assigned Numbers: RFC 1700 is Replaced 670 by an On-line Database", RFC 3232, DOI 10.17487/RFC3232, 671 January 2002, . 673 [RFC5040] Recio, R., Metzler, B., Culley, P., Hilland, J., and D. 
Garcia, "A Remote Direct Memory Access Protocol Specification", RFC 5040, DOI 10.17487/RFC5040, October 2007.

[RFC5041] Shah, H., Pinkerton, J., Recio, R., and P. Culley, "Direct Data Placement over Reliable Transports", RFC 5041, DOI 10.17487/RFC5041, October 2007.

[RFC5667] Talpey, T. and B. Callaghan, "Network File System (NFS) Direct Data Placement", RFC 5667, DOI 10.17487/RFC5667, January 2010.

Appendix A. Changes Since RFC 5667

Corrections and updates made necessary by new language in [I-D.ietf-nfsv4-rfc5666bis] have been introduced. For example, references to deprecated features of RPC-over-RDMA Version One, such as RDMA_MSGP, and the use of the Read list for handling RPC replies, have been removed. The term "mapping" has been replaced with the term "binding" or "Upper Layer Binding" throughout the document. Some material that duplicates what is in [I-D.ietf-nfsv4-rfc5666bis] has been deleted.

Material required by [I-D.ietf-nfsv4-rfc5666bis] for Upper Layer Bindings that was not present in [RFC5667] has been added, including discussion of how each NFS version properly estimates the maximum size of RPC replies.

Technical corrections have been made. For example, mentions of 12KB and 36KB inline thresholds have been removed. The reference to a nonexistent NFS version 4 SYMLINK operation has been replaced with NFS version 4 CREATE(NF4LNK).

The discussion of NFS version 4 COMPOUND handling has been completed. Some changes were made to the algorithm for matching DDP-eligible results to Write chunks.

Requirements to ignore extra Read or Write chunks have been removed from the NFS version 2 and 3 Upper Layer Binding, as they conflict with [I-D.ietf-nfsv4-rfc5666bis].

A complete discussion of reply size estimation has been introduced for all protocols covered by the Upper Layer Bindings in this document.
A section discussing NFS version 4 retransmission and connection loss has been added.

The following additional improvements have been made, relative to [RFC5667]:

o An explicit discussion of NFS version 4.0 and NFS version 4.1 backchannel operation has replaced the previous treatment of callback operations.

o A binding for NFS version 4.2 has been added that includes discussion of new data-bearing operations like READ_PLUS.

o A section suggesting a mechanism for periodically assessing connection health has been introduced.

o Language inconsistent with or contradictory to [I-D.ietf-nfsv4-rfc5666bis] has been removed from Sections 2 and 3, and both sections have been combined into Section 2 in the present document.

o Ambiguous or erroneous uses of RFC 2119 terms have been corrected.

o References to obsolete RFCs have been updated.

o An IANA Considerations Section has replaced the "Port Usage Considerations" Section.

o Code excerpts have been removed, and figures have been modernized.

Appendix B. Acknowledgments

The author gratefully acknowledges the work of Brent Callaghan and Tom Talpey on the original NFS Direct Data Placement specification [RFC5667]. The author also wishes to thank Bill Baker and Greg Marsden for their support of this work.

Dave Noveck provided excellent review, constructive suggestions, and consistent navigational guidance throughout the process of drafting this document. Dave also contributed the text of Section 4.5.

Thanks to Karen Deitke for her sharp observations about idempotency, and the clarity of the discussion of NFS COMPOUNDs and NFS sessions.

Special thanks go to Transport Area Director Spencer Dawkins, nfsv4 Working Group Chair Spencer Shepler, and nfsv4 Working Group Secretary Thomas Haynes for their support.
Author's Address

Charles Lever (editor)
Oracle Corporation
1015 Granger Avenue
Ann Arbor, MI 48104
USA

Phone: +1 248 816 6463
Email: chuck.lever@oracle.com