commit 0a7d0c30a940dbbafe3f97fa222750d95870df93 Author: Benjamin Kaduk Date: Fri Sep 18 08:56:44 2020 -0700 Make OpenAFS 1.9.0 Update version strings for the first 1.9.x development release. Change-Id: I0d0e204ffe8d64d7c0f794f313c0f24ccea12783 Reviewed-on: https://gerrit.openafs.org/14362 Reviewed-by: Michael Meffie Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 26a3f43a18508aa6fe63ad267f3127555f123ab9 Author: Benjamin Kaduk Date: Fri Sep 4 08:56:36 2020 -0700 Import NEWS from OpenAFS 1.8.6 Stay up to date with the stable branch at least until the initial version of the new release series. Change-Id: Iefcd9cc039399cd4cbbcc0474c2cabffa7780305 Reviewed-on: https://gerrit.openafs.org/14344 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 67a4279b65cc5082e23e72964b3974e17eeb77a9 Author: Benjamin Kaduk Date: Fri Sep 4 08:55:19 2020 -0700 Update 1.9.0 NEWS for recent changes Add some entries for the commits that landed since the previous update. Change-Id: I74820ee5a07c3fb539f233b2bd0c30aab262ba74 Reviewed-on: https://gerrit.openafs.org/14343 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit d2e755e33a266df17169a1fc05db1e540b5e76af Author: Mark Vitale Date: Tue May 12 12:59:31 2020 -0400 DARWIN: disable kextutil check for versions requiring notarization Our kextutil signing check will fail for releases that require notarization (Mojave 10.14.5 and up, Catalina 10.15 all versions), because we aren't notarized yet at the time of the check. Instead, disable the check for those releases. Change-Id: Iec1b74d18ae02cdd031ed3194ffb9900aa8a1b55 Reviewed-on: https://gerrit.openafs.org/14222 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 4d6c255c816c0a4f765048792dea34671fff6e87 Author: Thomas L. Kula Date: Thu May 14 14:08:40 2009 -0400 dumpscan: Don't call cb_dirent twice This fixes a bug where p->cb_dirent is called twice, if it exists. 
Change-Id: I7a7a6abf522b62eb310d003a61b3bbcdcda9e850 Reviewed-on: https://gerrit.openafs.org/14308 Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 85893ac3df0c2cb48776cf1203ec200507b6ce7d Author: Marcio Barbosa Date: Mon Aug 31 19:56:56 2020 +0000 Revert "vos: take RO volume offline during convertROtoRW" This reverts commit 32d35db64061e4102281c235cf693341f9de9271. While that commit did fix the mentioned problem, depending on "vos" to set the volume to be converted as "out of service" is not ideal. Instead, this volume should be set as offline by the SAFSVolConvertROtoRWvolume RPC, executed on the volume server. The proper fix for this problem will be introduced by another commit. Change-Id: I0ce5ba793fe3c07e535225191b74eeb402ab5bfd Reviewed-on: https://gerrit.openafs.org/14339 Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 8b68f1a4e1e3ae06de0d6c5a8af60ef99cacb83a Author: Michael Meffie Date: Mon Aug 24 13:12:13 2020 -0400 build: Add rpm target Add a top-level makefile target to build RPMs for Red Hat distributions from the currently checked out commit. The resulting rpms are placed in the packages/rpmbuild/RPMS/ directory. The rpm target is intended to be a convenience for testing changes to the rpm packaging or generating packages for local testing. 
    Change-Id: Id951eb2b03629be59f6258e89e8356fe1fde1ff5
    Reviewed-on: https://gerrit.openafs.org/14114
    Reviewed-by: Andrew Deason
    Reviewed-by: Cheyenne Wills
    Reviewed-by: Benjamin Kaduk
    Tested-by: BuildBot

commit 7cc6b97ad26089ecb88019468f3ef7c0222cebe1
Author: Michael Meffie
Date:   Fri May 1 14:05:24 2020 -0400

    makesrpm: Support custom version strings

    The makesrpm.pl script generates a source RPM by creating a temporary
    rpmbuild workspace, populating the SOURCES and SPECS directories in
    that workspace, running rpmbuild to build the source RPM, and finally
    copying the resulting source RPM out of the temporary workspace.

    The name of the source RPM file created by rpmbuild depends on the
    package version and release strings. Unfortunately, the format of the
    source RPM file name changed around OpenAFS 1.6.0, so makesrpm.pl has
    special logic to find the version string and extra code depending on
    the detected OpenAFS version.

    Instead of trying to predict the name of the resulting source RPM file
    from the OpenAFS version string, and having different logic for old
    versions of OpenAFS, use a filename glob to find the resulting source
    RPM file in the temporary rpmbuild workspace. Remove the major, minor,
    and patch level variables, which were only used to guess the name of
    the resulting source RPM file.

    Convert '-' characters to '_' in the package version and package
    release, since the '-' character is reserved by rpm as a field
    separator.

    While here, add the --dir option to specify the path of the generated
    source RPM, and change the 'srpm' makefile target to use the new --dir
    option, instead of changing the current directory before running
    makesrpm.pl. Also, add a dependency on the 'dist' makefile target,
    since the source and document tarballs are required to build the
    source RPM.

    Add pod documentation and add the --help (-h) option to print a brief
    help message, and add the --man option to print the full man page.

With this change, we can build a source RPM even when the .version file in the src.tar.bz file has a custom format or was created from a checkout of the master branch or other non-release reference. Change-Id: I7320afe6ac1f77d4dd38fcc194d41678fde5c950 Reviewed-on: https://gerrit.openafs.org/14116 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 4f78b3fdf1b6df9a5da85fc8bcfae28857081799 Author: Stephan Wiesand Date: Tue Aug 25 23:34:39 2020 +0200 Correct our contributor's code of conduct There are no races. Racism does exist though. Change-Id: I0a4cde55a5f470649eb99c5d7f30c9cec86d9baa Reviewed-on: https://gerrit.openafs.org/14320 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit c4f853aa00f1650b678cbd22ad1e2a9cf01c1303 Author: Andrew Deason Date: Wed Aug 26 15:41:00 2020 -0500 UKERNEL: Build linktest with COMMON_CFLAGS Currently, 'linktest' in libuafs is built with a weird custom rule that specifies several various CFLAGS and LDFLAGS, etc. One side-effect of this is that linktest is built without specifying -O, even if optimization is otherwise enabled. Normally nobody would care about the optimization of linktest, since it's never supposed to be run, but this can cause an error when building with -D_FORTIFY_SOURCE=1 on some systems (such as RHEL7): In file included from /usr/include/sys/types.h:25:0, from /.../src/config/afsconfig.h:1485, from /.../src/libuafs/linktest.c:15: /usr/include/features.h:330:4: error: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Werror=cpp] # warning _FORTIFY_SOURCE requires compiling with optimization (-O) ^ cc1: all warnings being treated as errors make[3]: *** [linktest] Error 1 For now, to fix this just include $(COMMON_CFLAGS) in the flags we give for linktest, so $(OPTMZ) also gets pulled in, and building linktest gets a little closer to a normal compilation step. 
Change-Id: I3362dcfe8407825ab88854ae59da4188ed16be9d Reviewed-on: https://gerrit.openafs.org/14324 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 696f2ec67b049639abf04905255a7d6173dbb19e Author: Jan Iven Date: Tue Sep 1 14:51:25 2020 +0200 ptserver: Remove duplicate ubik_SetLock in listSuperGroups It looks like a call to ubik_SetLock(.. LOCKREAD) was left in place in listSuperGroups after locking was moved to ReadPreamble in commit a6d64d70 (ptserver: Refactor per-call ubik initialisation) When compiled with 'supergroups', and once contacted by "pts mem -expandgroups ..", ptserver will therefore abort() with Ubik: Internal Error: attempted to take lock twice This patch removes the superfluous ubik_SetLock. FIXES 135147 Change-Id: I8779710a6d68e4126fc482123b576690d86e4225 Reviewed-on: https://gerrit.openafs.org/14338 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 16bae98ec525fa013514fb46398df682d7637ae0 Author: Cheyenne Wills Date: Mon Aug 24 11:10:30 2020 -0600 INSTALL: document the minimum Linux kernel level The change associated with gerrit #14300 removed support for older Linux kernels (2.6.10 and earlier). The commit 'Import of code from autoconf-archive' (d8205bbb4) introduced a check for Autoconf 2.64. Autoconf 2.64 was released in 2009. The commit 'regen.sh: Use libtoolize -i, and .gitignore generated build-tools' (a7cc505d3) introduced a dependency on libtool's '-i' option. Libtool supported the '-i' option with libtool 1.9b in 2004. Update the INSTALL instructions to document a minimum Linux kernel level and the minimum levels for autoconf and libtool. Notes: RHEL4 (EOL in 2017) had a 2.6.9 kernel and RHEL5 has a 2.6.18 kernel. RHEL5 has libtool 1.5.22 and autoconf 2.59, RHEL6 has libtool 2.2.6 and autoconf 2.63, and RHEL7 has libtool 2.4.2 and autoconf 2.69. 
    Change-Id: I235eeffa4adb152e05aab7aca839700816e62c83
    Reviewed-on: https://gerrit.openafs.org/14305
    Tested-by: BuildBot
    Reviewed-by: Benjamin Kaduk

commit b968875a342ba8f11378e76560b46701f21391e8
Author: Yadavendra Yadav
Date:   Fri Aug 21 01:54:00 2020 +0530

    afs: Avoid NatPing event on all connection

    Inside release_conns_user_server, the connection vector is traversed,
    and after a connection is destroyed, a new eligible connection is
    found on which the NatPing event will be set. Ideally, NatPing should
    be set on only one connection, but the current code sets a NatPing
    event on every connection to that server while traversing them. When
    there is a large number of connections to a server, this can lead to a
    huge number of “RX_PACKET_TYPE_VERSION” packets being sent to that
    server.

    Since this happens during garbage collection of user structs, the
    following steps were used to simulate the issue:

    - Had one script that does a “cd” to a volume mount and then sleeps
      for a long time.
    - Ran an infinite while loop that called the above script using
      PAG-based tokens (as a new connection is created for each PAG).
    - Instrumented the code so that the code segment where the NatPing
      event is set is hit. Mainly, reduced NOTOKTIMEOUT to 60 sec.

    To fix this issue, set NatPing on one connection and, once it is set,
    break out of the “for” loop traversing the server connections.

    Change-Id: Ia38cec0403fde76cdd59aa664bd261481e2edee6
    Reviewed-on: https://gerrit.openafs.org/14312
    Tested-by: BuildBot
    Reviewed-by: Benjamin Kaduk
    Reviewed-by: Andrew Deason

commit 291bad659e26c21332abd2954ee8d49fccad90da
Author: Mark Vitale
Date:   Mon Apr 20 14:51:08 2020 -0400

    vos: avoid 'half-locked' volume after interrupted 'vos rename'

    Reported symptoms:

    If a 'vos rename' is interrupted after it has locked the volume and
    replaced the VLDB entry, but before it has unlocked the volume, the
    volume will remain locked.
    However, the locked volume will NOT be listed as locked in any vos
    commands that display locked status (see below for details).

    Background:

    Most vos write operations lock the VLDB volume entry before
    proceeding, then release the volume lock when finished. This is
    accomplished via VL_SetLock and VL_ReleaseLock, respectively.

    VL_SetLock always sets these members in the VLDB volume entry:
    - flags is modified to set the required VLOP_* code bit as specified
    - LockAFSid is set to 0 (never implemented)
    - LockTimestamp is set to the current time

    VL_ReleaseLock always sets them as follows:
    - flags is cleared of any VLOP_* code bit
    - LockAFSid is set to 0 (never implemented)
    - LockTimestamp is set to 0

    VL_ReplaceEntry(N) may also optionally clear each of these members:
    - flags operation bits may be explicitly cleared via LOCKREL_OPCODE
    - LockAFSid may be explicitly cleared via LOCKREL_AFSID
    - LockTimestamp may be explicitly cleared via LOCKREL_TIMESTAMP

    When all 3 options are specified, VL_ReplaceEntry also does the
    functional equivalent of a VL_ReleaseLock. Most vos operations use
    this method.

    However, when no lock release options are specified on
    VL_ReplaceEntry(N), the VLDB entry is simply replaced with the
    supplied entry. This includes whatever flags values are specified in
    the supplied entry; therefore, this amounts to an additional, implicit
    way to set or modify the flags.

    Root cause:

    'vos rename' (UV_RenameVolume) is the only vos operation that does all
    of the following things:
    - accepts a replacement volume entry that was obtained before
      VL_SetLock (and thus does NOT have any lock flags set)
    - issues VL_SetLock (which sets the lock flag in the VLDB)
    - issues VL_ReplaceEntry(N) with the original unlocked entry, and with
      no lock release options (thus with explicit intent to leave the lock
      flag unchanged, but inadvertently doing an implicit clear of the
      lock flag in the VLDB)
    - (performs some additional volserver work)
    - issues VL_ReleaseLock to release the volume lock

    Therefore, if 'vos rename' is cancelled or killed before reaching the
    final VL_ReleaseLock step, the VLDB entry is left with the lock flags
    cleared but the LockTimestamp still set. As we will see below, this
    'half-locked' state produces confusing results from other vos
    commands.

    Detection of locked state:

    The 'vos lock' command (and all other vos commands that issue
    VL_SetLock) use the lock timestamp to determine if a volume is locked.

    However, several other vos commands ('vos listvldb ', 'vos examine ',
    'vos listvldb -locked') use the VLDB entry's lock flags (not the lock
    timestamp) to determine if the volume is locked. Therefore, if the
    lock flags have been cleared but the lock timestamp is still set,
    these commands fail to detect that the volume is still locked. Yet an
    administrator's 'vos lock ' will still fail with:

        Could not lock VLDB entry for volume
        VLDB: vldb entry is already locked

    This is the external manifestation of the 'half-locked' state.

    Workaround and fix:

    This scenario has a simple workaround: 'vos unlock '.

    However, to avoid this confusing outcome in the first place, modify
    the 'vos rename' logic so that the lock flags are no longer
    inadvertently cleared. Now, if the 'vos rename' is interrupted before
    the volume is unlocked, it will still appear locked in normal vos
    command output.

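The 'half-locked' sequence described above can be modelled in a few lines of C. This is only a sketch of the state transitions as the commit message describes them, not the vlserver or vos code: the struct, the sketch_* functions, and the bit values are all hypothetical, and the preserve_lock_flag switch is merely one assumed way of "no longer clearing the lock flags".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; values are illustrative, not the real VLDB constants. */
#define SKETCH_VLOP_BIT          0x10u /* models "some VLOP_* operation bit" */
#define SKETCH_LOCKREL_OPCODE    0x1u
#define SKETCH_LOCKREL_AFSID     0x2u
#define SKETCH_LOCKREL_TIMESTAMP 0x4u

struct sketch_entry {
    uint32_t flags;
    uint32_t lock_afsid;
    uint32_t lock_timestamp;
};

/* Models VL_SetLock: set the operation bit and stamp the lock time. */
void sketch_set_lock(struct sketch_entry *db, uint32_t now)
{
    assert(db != 0);
    db->flags |= SKETCH_VLOP_BIT;
    db->lock_afsid = 0;
    db->lock_timestamp = now;
}

/* Models VL_ReplaceEntry(N): the supplied flags overwrite the stored flags,
 * and the lock members are cleared only when a release option asks for it. */
void sketch_replace_entry(struct sketch_entry *db,
                          const struct sketch_entry *supplied,
                          unsigned release)
{
    assert(db != 0 && supplied != 0);
    db->flags = supplied->flags; /* the implicit set/modify of flags */
    if (release & SKETCH_LOCKREL_OPCODE)
        db->flags &= ~SKETCH_VLOP_BIT;
    if (release & SKETCH_LOCKREL_AFSID)
        db->lock_afsid = 0;
    if (release & SKETCH_LOCKREL_TIMESTAMP)
        db->lock_timestamp = 0;
}

/* What VL_SetLock callers are described as checking. */
int sketch_locked_by_timestamp(const struct sketch_entry *db)
{
    return db->lock_timestamp != 0;
}

/* What the listing commands are described as checking. */
int sketch_locked_by_flags(const struct sketch_entry *db)
{
    return (db->flags & SKETCH_VLOP_BIT) != 0;
}

/* A rename that is killed before the final release step. */
struct sketch_entry sketch_interrupted_rename(int preserve_lock_flag)
{
    struct sketch_entry db = { 0, 0, 0 };
    struct sketch_entry supplied = db; /* fetched before the lock is taken */

    sketch_set_lock(&db, 1000);
    if (preserve_lock_flag)
        supplied.flags |= db.flags & SKETCH_VLOP_BIT; /* assumed fix */
    sketch_replace_entry(&db, &supplied, 0);
    /* interrupted here: the release step never runs */
    return db;
}
```

With preserve_lock_flag set to 0 the returned entry is locked by timestamp but not by flags, which is the confusing state; with it set to 1 both checks agree.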
Change-Id: I6cc16d20c4487de4e9a866c6f0c89d950efd2f7d Reviewed-on: https://gerrit.openafs.org/14157 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 21cd26cb0d0a37d9412c0285a3c73c693222fd8a Author: Mark Vitale Date: Tue Aug 25 12:37:09 2020 -0400 rxgen: remove dead code hndle_param_tail Since the original IBM code import, hndle_param_tail has been dead code. It was later ifdef'd out in commit 8f2df21ffe59 'pull-prototypes-to-head-20020821' Remove the dead code from the tree. No functional change is incurred by this commit. Change-Id: I29128eecc93a5871f5bb9369c3983baf5b537beb Reviewed-on: https://gerrit.openafs.org/14322 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit d5f0e16ac44475be55a7cc3e2895fc4a3a923ece Author: Marcio Barbosa Date: Tue Aug 18 13:56:26 2020 +0000 bos: suppress unnecessary warn if -noauth Commit d008089a7 (Add interface to select client security objects) consolidated the code that selects the client security objects into a set of new interfaces. Before this commit, the "bos: running unauthenticated" message, which warns the user when an unauthenticated connection is established, used to be suppressed if the -noauth flag was specified. Similarly to commit b3c16324e (ubik: Make ugen_ClientInit honor noAuthFlag), recover the original behavior avoiding warn messages about unauthenticated connections if the -noauth flag is provided. Change-Id: Iaf0ac6bd91ea160256823512f060afc94b5926bf Reviewed-on: https://gerrit.openafs.org/14306 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 904f5bd398db248c11b30ef7e360ce5141dcd1f3 Author: Michael Meffie Date: Thu Apr 16 16:29:09 2020 -0400 vlserver: fix missing read-only entries from ListAttributesN2 The ListAttributesN2() RPC can fail to list read-only entries under certain circumstances. 
    This RPC is used by the `vos listvldb` command to retrieve vldb
    entries (unless the -name option is given).

    The `vos listvldb` command fails to list volume entries when run with
    the '-server' option for volumes that have read-only replicas, but
    have not been released.

    Consider the following example volume:

        $ vos create fs1.example.com a test
        $ vos addsite fs1.example.com a test
        $ vos addsite fs2.example.com a test
        $ vos listvldb
        ...
        test
            RWrite: 536870921
            number of sites -> 3
               server fs1.example.com partition /vicepa RW Site
               server fs1.example.com partition /vicepa RO Site -- Not released
               server fs2.example.com partition /vicepa RO Site -- Not released

    `vos listvldb` fails to find the volume when the search is limited to
    server 'fs2':

        $ vos listvldb -server fs2.example.com
        VLDB entries for server fs2.example.com
        Total entries: 0

    Instead of the expected results:

        $ vos listvldb -server fs2.example.com
        test
            RWrite: 536870921
            number of sites -> 3
               server fs1.example.com partition /vicepa RW Site
               server fs1.example.com partition /vicepa RO Site -- Not released
               server fs2.example.com partition /vicepa RO Site -- Not released

    This situation makes it difficult to remove old server addresses from
    the vldb. The 'vos remaddrs' and 'vos changeaddr -remove' commands
    will complain that the server addresses are still in use by volume
    entries, but running 'vos listvldb -server' will not show which volume
    entries are using them.

    The entries are not listed for unreleased volumes because the
    ListAttributesN2() RPC is currently checking the volume VLF_ROEXISTS
    flag, instead of the server site flags (serverFlags) to determine when
    the entry is a read-only site. The volume VLF_ROEXISTS flag is set
    when a volume is released.

    To fix this, make ListAttributesN2 check for the VLSF_ROVOL site flag,
    instead of the VLF_ROEXISTS entry flag.

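The difference between the entry-level flag and the per-site flag can be illustrated with a simplified model. Everything here is assumed for the sketch: the struct layout, the sketch_* names, the numeric server ids, and the bit values. It is not the ListAttributesN2 implementation, and the "old" function is only one plausible reading of the behavior described above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_SITES 4
/* Illustrative bit values, assumed for this sketch. */
#define SKETCH_VLF_ROEXISTS 0x2000u /* entry flag: set once the volume is released */
#define SKETCH_VLSF_ROVOL   0x02u   /* site flag: this site is a read-only site */
#define SKETCH_VLSF_RWVOL   0x04u   /* site flag: this site is the read-write site */

struct sketch_vldb_entry {
    uint32_t entry_flags;
    int nsites;
    uint32_t site_server[SKETCH_MAX_SITES]; /* hypothetical server ids */
    uint32_t site_flags[SKETCH_MAX_SITES];
};

/* Scan the per-site flags for a read-only site on the given server. */
int sketch_ro_site_on(const struct sketch_vldb_entry *e, uint32_t server)
{
    int i;

    assert(e != 0 && e->nsites <= SKETCH_MAX_SITES);
    for (i = 0; i < e->nsites; i++) {
        if (e->site_server[i] == server &&
            (e->site_flags[i] & SKETCH_VLSF_ROVOL) != 0)
            return 1;
    }
    return 0;
}

/* Described old behavior: read-only matches hinge on the entry flag,
 * so unreleased volumes never match. */
int sketch_match_old(const struct sketch_vldb_entry *e, uint32_t server)
{
    if ((e->entry_flags & SKETCH_VLF_ROEXISTS) == 0)
        return 0;
    return sketch_ro_site_on(e, server);
}

/* Described new behavior: only the site flag decides. */
int sketch_match_new(const struct sketch_vldb_entry *e, uint32_t server)
{
    return sketch_ro_site_on(e, server);
}

/* The unreleased "test" volume from the example: RW and RO on server 1
 * (fs1), RO on server 2 (fs2), and no released-RO entry flag. */
struct sketch_vldb_entry sketch_example_entry(void)
{
    struct sketch_vldb_entry e = { 0, 3, { 1, 1, 2, 0 },
        { SKETCH_VLSF_RWVOL, SKETCH_VLSF_ROVOL, SKETCH_VLSF_ROVOL, 0 } };
    return e;
}
```

For the example entry, the old check finds nothing on server 2 while the new check does, which mirrors the 'Total entries: 0' symptom.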
    Change-Id: Ib636fbe016d1d2f5b117624d9930dba83ebcef8a
    Reviewed-on: https://gerrit.openafs.org/14154
    Reviewed-by: Andrew Deason
    Tested-by: BuildBot
    Reviewed-by: Benjamin Kaduk

commit 13a49aaf0d5c43bce08135edaabb65587e1a8031
Author: Cheyenne Wills
Date:   Mon Aug 17 08:20:11 2020 -0600

    LINUX 5.9: Remove HAVE_UNLOCKED_IOCTL/COMPAT_IOCTL

    Linux-5.9-rc1 commit 'fs: remove the HAVE_UNLOCKED_IOCTL and
    HAVE_COMPAT_IOCTL defines' (4e24566a) removed the two referenced
    macros from the kernel.

    Support for unlocked_ioctl and compat_ioctl was introduced in
    Linux 2.6.11.

    Remove references to HAVE_UNLOCKED_IOCTL and HAVE_COMPAT_IOCTL using
    the assumption that they were always defined.

    Notes:

    With this change, building against kernels 2.6.10 and older will
    fail. RHEL4 (EOL in March 2017) used a 2.6.9 kernel. RHEL5 uses a
    2.6.18 kernel.

    In linux-2.6.33-rc1 the commit messages for "staging: comedi: Remove
    check for HAVE_UNLOCKED_IOCTL" (00a1855c) and "Staging: comedi: remove
    check for HAVE_COMPAT_IOCTL" (5d7ae225) both state that all new
    kernels have support for unlocked_ioctl/compat_ioctl so the checks can
    be removed along with removing support for older kernels.

    Change-Id: Idd2716f3573ea455f8a5e1535bca584af0787717
    Reviewed-on: https://gerrit.openafs.org/14300
    Reviewed-by: Benjamin Kaduk
    Tested-by: Benjamin Kaduk

commit f5051b87a56b3a4f7fd7188cbd16a663eee8abbf
Author: Michael Meffie
Date:   Fri May 15 12:01:44 2020 -0400

    vos: avoid CreateVolume when restoring over an existing volume

    Currently, the UV_RestoreVolume2 function always attempts to create a
    new volume, even when doing an incremental restore over an existing
    volume. When the volume already exists, the volume creation operation
    fails on the volume server with a VVOLEXISTS error. The client will
    then attempt to obtain a transaction on the existing volume. If a
    transaction is obtained, the incremental restore operation will
    proceed. If a full restore is being done, the existing volume is
    removed and a new empty volume is created.

    Unfortunately, the failed volume creation is logged by the volume
    server, and so litters the log file with:

        Volser: CreateVolume: Unable to create the volume; aborted, error code 104

    To avoid polluting the volume server log with these messages, reverse
    the logic in UV_RestoreVolume2. Assume the volume already exists and
    try to get the transaction first when doing an incremental restore.
    Create a new volume if the transaction cannot be obtained because the
    volume is not present. When doing a full restore, remove the existing
    volume, if one exists, and then create a new empty volume.

    Change-Id: I8bdc13130d12c81cd2cd18a9484852708cac64d7
    Reviewed-on: https://gerrit.openafs.org/14208
    Tested-by: BuildBot
    Reviewed-by: Marcio Brito Barbosa
    Tested-by: Marcio Brito Barbosa
    Reviewed-by: Andrew Deason
    Reviewed-by: Benjamin Kaduk

commit 624219a1b2192e5c7b6b45e2cbe784a9c5f33a96
Author: Michael Meffie
Date:   Tue Aug 4 10:34:07 2020 -0400

    tests: Accommodate c-tap-harness 4.7

    The SOURCE and BUILD environment variables have been changed to
    C_TAP_SOURCE and C_TAP_BUILD in the new version of c-tap-harness. The
    runtests command syntax has changed as well.

    Convert all of the old SOURCE and BUILD environment variables to the
    new C_TAP_SOURCE and C_TAP_BUILD names.

    Add the required -l command line option to specify the test list.

    Add the new runtests -v option to run the tests in verbose mode to
    make it easier to see which tests failed.

Change-Id: I209a6dc13d6cd1507519234fce1564fc4641e70b Reviewed-on: https://gerrit.openafs.org/14295 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 3f377aa117273eba5c77ad652c0b086446b3f874 Author: Russ Allbery Date: Mon Aug 3 20:59:25 2020 -0400 Import of code from c-tap-harness This commit updates the code imported from c-tap-harness to abdb66561ffd4d2f238fdb06f448ccf09d80c059 (release/4.7) Upstream changes are: Daniel Collins (1): Add is_blob() test function. Daniel Kahn Gillmor (1): LICENSE: use https for all URLs Daria Brashear (1): Add verbose mode environment variable to runtests Julien ÉLIE (2): Document -v in usage and comments of runtests Avoid realloc of zero length in tests/runtests.c Marc Dionne (1): Add test_cleanup_register_with_data Russ Allbery (115): clang --analyze cleanups for runtests Modernize POD tests Update README to my current layout Explicitly note that test programs must be executable Fix comment typo in tests/runtests.c Switch to a copyright-format 1.0 LICENSE file Flush harness output after each line Show the test count as ? 
when the plan is deferred More correctly backspace over test counts when aborting Refactor test list handling Allow passing tests on the runtests command line Don't allow command-line arguments if a list was given Search for tests under the name given as well Release 2.0 Fix backward incompatibility when searching for tests Document decision to ignore TAP version directives Release 2.1 Document different runtests behavior in bail handling Change exit status of bail to 255 Release 2.2 Add a new test_cleanup_register C API Add warn_unused_result attributes Add portability for warn_unsed_result attributes to tap/macros.h Minor coding style fix (spacing) in runtests.c Split the runtests usage string for ISO C90 string limits Include stddef.h Diagnose failure to register the exit handler Use diag internally in the basic C TAP library Some additional comments about cleanup functions Move repetitive printing code in the C TAP library to a macro Set a flag when bailing for more correct cleanup Change my email address to eagle@eyrie.org Release 2.3 Add diag_file_add and diag_file_remove functions Don't die for unknown files passed to diag_file_remove Release 2.4 Update comment about AIX and WCOREDUMP Don't test for NULL before calling free Be more careful about file descriptors in child processes Run cleanup functions in non-primary processes as well Release 3.0 Update collective package copyright notices at start of LICENSE Check integer overflows on memory allocation, fix string creation Switch POD spelling test to use Lancaster consensus variable Add new bnrealloc API for brealloc with checked multiplication Rename nrealloc to reallocarray Return the test status from test functions Fix the overflow check for breallocarray Fix the overflow check for xreallocarray in runtests Restructure test result reallocation in runtests Change diag and sysdiag to always return true Release 3.1 Fix typos in basic.c and basic.h Fix usage message when running runtests with no arguments 
Update introductory runtests comments for current syntax Add the -l flag to suggested runtests invocation in README Support comments and blank lines in test lists Release 3.2 Update licensing information Various improvements to verbose support Compile warning-free with Clang, check Autoconf macros Release 3.3 Remove unnecessary assert.h include in tap/basic.c Fix some additional -v documentation issues Rebalance usage to avoid too-long strings Fix segfault in runtests with empty test list Release 3.4 Document running autogen if starting from Git Rename autogen to bootstrap Support and prefer C_TAP_SOURCE and C_TAP_BUILD Fix comment typo in tests/runtests.c Add missing va_end to is_double Release 4.0 Fix all non-https www.eyrie.org URLs Add is_bool C test function Add DocKnot metadata and a Markdown README file Update documentation for new DocKnot standards Release 4.1 Use more defaults from DocKnot templates Fix new fall-through warning in GCC 7 Use compiler warnings from rra-c-util, fix issues Merge pull request #4 from solemnwarning/master Coding style fixes and NEWS for is_blob Re-enable -Wunknown-pragmas for GCC Avoid zero-length realloc allocations in breallocarray Update copyright date on tests/runtests.c Release 4.2 Add SPDX-License-Identifier headers to source files Add and run new check-cppcheck target Fix instructions for running one test Identify values as left and right Fix is_string comparisons with NULL pointers Add support for running tests under valgrind Replace putc with fprintf Update shared files from rra-c-util Release 4.3 Update NEWS date for 4.3 release Collapse some copyright dates NEWS and coding style for test_cleanup_register_with_data Remove unused variables caught by Clang scan-build Update to rra-c-util 8.0 Fix error checking in bstrndup Release 4.4 Add support for C++ Document that C TAP Harness can be built as C++ Release 4.5 Regenerate README files Reformat using clang-format 10 Update to rra-c-util 8.1 Release 4.6 Fix spelling 
errors caught by codespell Protect the test suite against C_TAP_VERBOSE Switch to GitHub Actions for CI Add NEWS entry for GCC 10 warning fixes Release 4.7 Change-Id: I5a78215bf99b53bd848f0fa6bb9092deab38f24e Reviewed-on: https://gerrit.openafs.org/14294 Reviewed-by: Andrew Deason Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit eccd4b9778014c36a4b3af6d9e80194066bd2195 Author: Andrew Deason Date: Tue Jun 2 13:37:00 2020 -0500 afs: Always define our own osi_timeval32_t Since OpenAFS 1.0, osi_GetTime has taken a timeval-like pointer, which contains 32-bit fields (the actual type has been called either osi_timeval_t or osi_timeval32_t over time). For platforms that have a native timeval-like type with 32-bit fields, we just define osi_timeval32_t to that type, and elsewhere we define our own struct to be osi_timeval32_t. For platforms that use the native timeval, we can then define osi_GetTime() to just be, e.g., microtime(). This approach is difficult to maintain, though, because we must keep track of whether 'struct timeval' contains 32-bit fields on each platform, which can depend on many factors. It's easy to make mistakes (the current tree already contains mistakes), and there's not much benefit. To avoid all of this, just always define osi_timeval32_t to be our own struct with afs_int32 fields, and provide definitions for osi_GetTime that convert from the native time struct to our osi_timeval32_t. This does mean that for some platforms we do an unnecessary type conversion, but this is a small price to pay for more straightforward and maintainable code. To be a little more sure that our types are correct, change osi_GetTime to be defined as an inline function instead of a macro. At the same time, do a similar conversion for the KERNEL implementation of the rx clock_GetTime function. Get rid of platform-specific mess, and do a straightforward type conversion between osi_timeval32_t and struct clock in an inline function. 
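As a rough userspace illustration of the approach described above (own struct with 32-bit fields, plus an inline function that converts from the native time struct), consider the sketch below. The names sketch_int32, sketch_timeval32_t, sketch_from_native and sketch_GetTime are hypothetical stand-ins, and gettimeofday() is used only because this sketch runs in userspace; the kernel code would call each platform's own time source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

typedef int32_t sketch_int32; /* stand-in for afs_int32 */

/* Always our own struct, regardless of how wide the native timeval is. */
typedef struct {
    sketch_int32 tv_sec;
    sketch_int32 tv_usec;
} sketch_timeval32_t;

/* Explicit field-by-field conversion from the native struct. The seconds
 * value is narrowed to 32 bits, which is the point of the type. */
static inline void sketch_from_native(const struct timeval *native,
                                      sketch_timeval32_t *out)
{
    assert(native != NULL && out != NULL);
    out->tv_sec = (sketch_int32)native->tv_sec;
    out->tv_usec = (sketch_int32)native->tv_usec;
}

/* An inline function rather than a macro, so the compiler checks the
 * argument type. Returns 0 on success, -1 if the clock read fails. */
static inline int sketch_GetTime(sketch_timeval32_t *atv)
{
    struct timeval now;

    if (gettimeofday(&now, NULL) != 0)
        return -1;
    sketch_from_native(&now, atv);
    return 0;
}
```

The cost is one extra copy on platforms whose native struct already has 32-bit fields, which matches the trade-off the commit message accepts.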
Change-Id: I18819acb556a2a7f1b6da6994db9783c48108934 Reviewed-on: https://gerrit.openafs.org/14238 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit a5c3dfe99fa1831e3b416e89f52a03fd1cf9f73d Author: Andrew Deason Date: Tue Jun 2 13:12:14 2020 -0500 afs: Move osi_GetTime out of param.h Most platforms currently #define osi_GetTime in their param.h. This is really redundant, since the definition of osi_GetTime almost never changes for a given platform, so we end up with many copies of the same osi_GetTime definition for a given platform. Move osi_GetTime out of param.h for these platforms, and define it in osi_machdep.h instead, which is where most platform-specific definitions go. For DFBSD, we don't have an osi_machdep.h at all yet, so create a new one to contain the osi_GetTime definition. Currently we don't build libafs at all on DFBSD, but do this anyway so we don't lose the existing osi_GetTime definition. For NBSD, we were providing (conflicting!) definitions for osi_GetTime in param.h and in osi_machdep.h. Just remove the definitions in param.h, since those should have been getting overridden by the osi_machdep.h definition. Change-Id: I7097d9fe2fcd38c06ecc275e8fe3a2c69c9d0436 Reviewed-on: https://gerrit.openafs.org/14237 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit c56873bf95f6325b70e63ed56ce59a3c6b2b753b Author: Cheyenne Wills Date: Mon Jul 27 12:31:35 2020 -0600 afs: Avoid using logical OR when setting f_fsid Building with clang-10 produces the warning/error message warning: converting the result of '<<' to a boolean always evaluates to true [-Wtautological-constant-compare] for the expression abp->f_fsid = (AFS_VFSMAGIC << 16) || AFS_VFSFSID; The message is because a logical OR '||' is used instead of a bitwise OR '|'. The result of this expression will always set the f_fsid member to a 1 and not the intended value of AFS_VFSMAGIC combined with AFS_VFSFSID. 
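The effect of the two operators can be seen in a small standalone sketch. The function names and the sample values passed in are hypothetical; they are not the real AFS_VFSMAGIC or AFS_VFSFSID definitions. The shift is stored in a local first only so that the example compiles without the very warning being discussed.

```c
#include <assert.h>

/* Logical OR: the result is only ever 0 or 1. */
unsigned long sketch_fsid_logical(unsigned long magic, unsigned long fsid)
{
    unsigned long shifted;

    assert(magic <= 0xFFFFUL); /* keep the shifted value within 32 bits */
    shifted = magic << 16;
    return shifted || fsid;
}

/* Bitwise OR: combines the magic (high bits) with the id (low bits). */
unsigned long sketch_fsid_bitwise(unsigned long magic, unsigned long fsid)
{
    assert(magic <= 0xFFFFUL);
    return (magic << 16) | fsid;
}
```

For example, with a made-up magic of 0x1234 and id of 99 (0x63), the logical form yields 1 while the bitwise form yields 0x12340063.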
    Update the expression to use a bitwise OR instead of the logical OR.

    Note: This will change the value stored in f_fsid that is returned
    from statfs.

    The logical OR has been present since OpenAFS 1.0 for hpux/solaris and
    in UKERNEL since OpenAFS 1.5 with the commit 'UKERNEL: add
    uafs_statvfs' b822971a.

    Change-Id: I3e85ba48058ac68e3e3ac7f277623f660187926c
    Reviewed-on: https://gerrit.openafs.org/14292
    Reviewed-by: Andrew Deason
    Tested-by: BuildBot
    Reviewed-by: Benjamin Kaduk

commit 446457a1240b88fd94fc34ff5715f2b7f2f3ef12
Author: Cheyenne Wills
Date:   Mon Jul 27 12:31:03 2020 -0600

    afs: Set AFS_VFSFSID to a numerical value

    Currently when UKERNEL is defined, AFS_VFSFSID is always set to
    AFS_MOUNT_AFS, which is a string for many platforms for UKERNEL.

    Update src/afs/afs.h to ensure that the define for AFS_VFSFSID is a
    numeric value when building UKERNEL.

    Clean up the preprocessor indentation in src/afs/afs.h in the area
    around the AFS_VFSFSID defines.

    Thanks to adeason@sinenomine.net for pointing out a much easier
    solution for resolving this problem.

    Change-Id: I618fc4c89029a6cca2ca6f530b8f65399299a9d1
    Reviewed-on: https://gerrit.openafs.org/14279
    Reviewed-by: Andrew Deason
    Tested-by: BuildBot
    Reviewed-by: Benjamin Kaduk

commit e5f44f6e9af643cab3a66216dff901e0a4c5eda8
Author: Cheyenne Wills
Date:   Thu Jul 23 15:43:42 2020 -0600

    clang-10: ignore fallthrough warning in generated code

    Clang-10 will not recognize '/* fall through */' as an indicator to
    turn off the fallthrough warning due to the lack of a 'break' in a
    case statement.

    Code generated by flex uses the '/* fall through */' comments to turn
    off compiler warnings for fallthroughs in case statements.

    For code generated by flex, ignore the implicit-fallthrough via pragma
    or disable the warning via a compile time flag.

    Add a new env variable "CFLAGS_NOIMPLICIT_FALLTHROUGH" to selectively
    disable the compile check in Makefiles when checking is enabled.

Change-Id: I4c054defda03daa2aeb645ae2271dfa0cb54925f Reviewed-on: https://gerrit.openafs.org/14275 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 16f1b2f894c28614df0f096be8232b1176e87c70 Author: Cheyenne Wills Date: Mon Jul 27 08:33:03 2020 -0600 clang-10: use AFS_FALLTHROUGH for case fallthrough Clang-10 will not recognize '/* fallthrough */' as an indicator to turn off the fallthrough diagnostic due to the lack of a 'break' in a case statement. Clang-10 requires the '__attribute__((fallthrough))' statement to disable the diagnostic. In addition clang-10 is finding additional locations where fall throughs occur. Determine if the compiler supports '__attribute__((fallthrough))' to disable the implicit fallthrough diagnostic. Define a new macro 'AFS_FALLTHROUGH' that will disable the fallthrough diagnostic. Set it as a wrapper for the Linux kernel's 'fallthrough' macro if available, otherwise set it as a wrapper macro for '__attribute__((fallthrough))' if the compiler supports it. Update CODING to document the use of AFS_FALLTHROUGH when needing to fallthrough between case statements. Replace the '/* fallthrough */' comments with AFS_FALLTHROUGH, and add AFS_FALLTHROUGH as needed. Replace some fallthroughs with a break (or goto) if the flow was just to a break (or goto). e.g.

    case x:                 case x:
        somestmt;               somestmt;
                                break;
    case y:                 case y:
        break;                  break;

Correct a mis-indented brace '}' in src/WINNT/afsd/smb3.c Note, the clang maintainers have rejected the use of comments as a flag to turn off the fall through warnings.
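A minimal sketch of a fallthrough wrapper macro in the spirit of the commit above. EXAMPLE_FALLTHROUGH is a hypothetical stand-in and is not the actual AFS_FALLTHROUGH definition; it also uses the compiler's __has_attribute check, which is an assumption made for this example, where the commit describes detecting the attribute at configure time.

```c
#include <assert.h>

/* Hypothetical stand-in for AFS_FALLTHROUGH. Use the statement attribute
 * when the compiler reports support for it, otherwise expand to an empty
 * statement so the code still compiles. */
#if defined(__has_attribute)
# if __has_attribute(fallthrough)
#  define EXAMPLE_FALLTHROUGH __attribute__((fallthrough))
# endif
#endif
#ifndef EXAMPLE_FALLTHROUGH
# define EXAMPLE_FALLTHROUGH do { } while (0)
#endif

/* Counts how many cases run for a given level (expected range 0..2).
 * Level 2 deliberately falls through into level 1. */
static int count_steps(int level)
{
    int steps = 0;

    assert(level >= 0 && level <= 2);
    switch (level) {
    case 2:
        steps++;
        EXAMPLE_FALLTHROUGH;    /* intentional: also run the case 1 work */
    case 1:
        steps++;
        break;
    case 0:
        break;
    }
    return steps;
}
```

Marking the fallthrough with a statement rather than a comment is what lets the compiler tell an intentional fallthrough from a forgotten break.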
Change-Id: Ia5da10fc14fc1874baca035a3cf471e618e0d5f5 Reviewed-on: https://gerrit.openafs.org/14274 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit e61ab9353e99d3298815296abf6b02c50ebe3df0 Author: Michael Meffie Date: Wed Jul 1 21:50:09 2020 -0400 redhat: Add make to the dkms-openafs pre-requirements If `make` is not installed before dkms-openafs, the OpenAFS kernel module is not built during the dkms-openafs package installation. The failure happens in the "checking if linux kernel module build works" configure step, which invokes `make` to check the linux buildsystem. configure fails when `make` is not available, and gives the unhelpful suggestion (in this case) of configuring with --disable-kernel-module. The configure.log in the dkms build directory shows:

    configure:7739: checking if linux kernel module build works
    make -C /lib/modules/4.18.0-193.6.3.el8_2.x86_64/build M=/var/lib/dkms/openafs/...
    ./configure: line 7771: make: command not found
    configure: failed using Makefile:

Avoid this build failure by adding `make` to the list of dkms-openafs package pre-requirements. Change-Id: I98b3508341eea1df4fa7b6f43e88add1bda9ee2c Reviewed-on: https://gerrit.openafs.org/14266 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 2d01f35d05a71da3594569c66e688b4bc6b28401 Author: Andrew Deason Date: Fri May 29 12:57:50 2020 -0500 vol: Blank opts in VOptDefaults Instead of needing to set every single field in the 'opts' structure individually, blank the whole thing to make sure the entire struct is initialized. Remove the now-redundant lines that initialize various items to 0.
Change-Id: I799cdb55becd66a8f3d6ec2f81338843038d0abd Reviewed-on: https://gerrit.openafs.org/14280 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Kailas Zadbuke Reviewed-by: Yadavendra Yadav Reviewed-by: Benjamin Kaduk commit 4498bd8179e5e93a33468be3c8e7a30e569d560a Author: Andrew Deason Date: Mon Jun 22 22:54:52 2020 -0500 volser: Don't NUL-pad failed pread()s in dumps Currently, the volserver SAFSVolDump RPC and the 'voldump' utility handle short reads from pread() for vnode payloads by padding the missing data with NUL bytes. That is, if we request 4k of data for our pread() call, and we only get back 1k of data, we'll write 1k of data to the volume dump stream followed by 3k of NUL bytes, and log messages like this: 1 Volser: DumpFile: Error reading inode 1234 for vnode 5678 1 Volser: DumpFile: Null padding file: 3072 bytes at offset 40960 This can happen if we hit EOF on the underlying file sooner than expected, or if the OS just responds with fewer bytes than requested for any reason. The same code path tries to do the same NUL-padding if pread() returns an error (for example, EIO), padding the entire e.g. 4k block with NULs. However, in this case, the "padding" code often doesn't work as intended, because we compare 'n' (set to -1) with 'howMany' (set to 4k in this example), like so: if (n < howMany) Here, 'n' is signed (ssize_t), and 'howMany' is unsigned (size_t), and so compilers will promote 'n' to the unsigned type, causing this conditional to fail when n is -1. As a result, all of the relevant log messages are skipped, and the data in the dumpstream gets corrupted (we skip a block of data, and our 'howFar' offset goes back by 1). So this can result in rare silent data corruption in volume dumps, which can occur during volume releases, moves, etc. To fix all of this, remove this bizarre NUL-padding behavior in the volserver. Instead: - For actual errors from pread(), return an error, like we do for I/O errors in most other code paths. 
- For short reads, just write out the amount of data we actually read, and keep going. - For premature EOF, treat it like a pread() error, but log a slightly different message. For the 'voldump' utility, the padding behavior can make sense if a user is trying to recover volume data offline in a disaster recovery scenario. So for voldump, add a new switch (-pad-errors) to enable the padding behavior, but change the default behavior to bail out on errors. Change-Id: Ibd6e76c5ea0dea95e3354d9b34536296f81b4f67 Reviewed-on: https://gerrit.openafs.org/14255 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 37b55b30c65d0ab8c8eaabfda0dbd90829e2c46a Author: Cheyenne Wills Date: Thu Jul 16 15:52:00 2020 -0600 butc: fix int to float conversion warning Building with clang-10 results in 2 warnings/errors associated with trying to convert 0x7fffffff to a floating point value. tcmain.c:240:18: error: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Werror, -Wimplicit-int-float-conversion] if ((total > 0x7fffffff) || (total < 0)) /* Don't go over 2G */ and the same conversion warning on the statement on the following line: total = 0x7fffffff; Use floating point and decimal constants instead of the hex constants. For the test, use 2147483648.0 which is cleanly represented by a float. Change the comparison in the test from '>' to '>='. If the total value exceeds 2G, just assign the max value directly to the return variable. Change-Id: I79b2afa006496a756bd7b50976050c24827aa027 Reviewed-on: https://gerrit.openafs.org/14277 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 899b1af4183fb09fd55a36e3d10ffbdb9671a47e Author: Cheyenne Wills Date: Thu Jul 16 15:07:15 2020 -0600 autoconf: fix detection for fallthrough attribute Due to bug , ax_gcc_func_attribute.m4 fails to properly detect __attribute__((fallthrough)) in clang.
Until this is fixed in autoconf-archive upstream, fix our local copy of ax_gcc_func_attribute.m4, so we can detect __attribute__((fallthrough)) to make --enable-checking work with clang. Change-Id: I80a4557384f8e1438344e48bfe722e20c8773882 Reviewed-on: https://gerrit.openafs.org/14273 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 88da6b4dfa4ad2b53508f9e0b559392cecb69c86 Author: Cheyenne Wills Date: Thu Jul 16 15:05:13 2020 -0600 cf: Make local copy of ax_gcc_func_attribute.m4 Make a local copy of ax_gcc_func_attribute from autoconf-archive. This is needed in order to fix a bug in the detection of the fallthrough attribute. Remove ax_gcc_func_attribute.m4 from src/external/autoconf-archive/m4. Update LICENSE file to point to the local copy in src/cf. Change-Id: I6c4244d2cd4edab4262c1820435c00419d85303b Reviewed-on: https://gerrit.openafs.org/14272 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit bb5397e4c409e3c075ee73d6bf54a3b6eacc0060 Author: Mark Vitale Date: Fri Apr 20 00:57:28 2018 -0400 rx: prevent leakage of non-cached rx_connections (pthread) The rxi_connectionCache (AFS_PTHREAD_ENV only) allows applications to reuse rx_connection structs. Cached rx_connections are obtained via rx_GetCachedConnection and released via rx_ReleaseCachedConnection. This feature is used most heavily by libadmin and kauth, but there are other users in the tree as well. For instance, ubikclient routines ubik_ClientInit and ubik_ClientDestroy call rx_ReleaseCachedConnection (if AFS_PTHREAD_ENV) when disposing of their rx_connections. Unfortunately, in many cases these rx_connections were obtained via rx_NewConnection, _not_ from the cache via rx_GetCachedConnection. In those cases, rx_ReleaseCachedConnection will not find the rx_connection in the rxi_connectionCache, and thus it returns without doing anything.
Therefore, when ubik_ClientInit is passed an existing ubik_client (for re-initialization) that contains rx_connections NOT allocated via rx_GetCachedConnection, those connections are not destroyed, but will be silently leaked. Similarly, ubik_ClientDestroy will leak its rx_connections when it frees the ubik_client struct. For example, the fileserver host package calls ubik_ClientInit (via hpr_Initialize) and ubik_ClientDestroy (via hpr_End) to manage connections to the ptserver. However, these connections were obtained via rx_NewConnection, not rx_GetCachedConnection. If the fileserver has a failed call to the ptserver that sets prfail=1, the next RPC scheduled for that client (in CallPreamble) will refresh the thread's ubik_client (viced_uclient_key) by calling hpr_End -> ubik_ClientDestroy -> rx_ReleaseCachedConnection. The "released" connections will be leaked. This problem exists in all versions of OpenAFS going back to IBM 1.0. Starting with 1.8.x, many components that were formerly LWP-only are now pthreaded and thus susceptible to this leak. It seems difficult and error-prone to identify all possible code paths that may pass a non-cached rx_connection to rx_ReleaseCachedConnection, and convert them to obtain connections via rx_GetCachedConnection. Instead, prevent all existing and future leaks by modifying the connection cache to: - flag all rx_connections it allocates - correctly release any rx_connection it is passed, whether they came from the cache or not.
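The two-part approach listed above can be sketched roughly as follows. Every name here (example_conn, EXAMPLE_CONN_CACHED, example_release, and so on) is hypothetical and chosen for illustration; this is not the rx connection cache code, only the general idea of flagging cache-owned objects so that the release path can handle both kinds.

```c
#include <assert.h>
#include <stdlib.h>

#define EXAMPLE_CONN_CACHED 0x1u    /* hypothetical "allocated by the cache" flag */

struct example_conn {
    unsigned int flags;
    int refcount;                   /* only meaningful for cached connections */
};

static int example_destroyed;       /* counts connections actually freed */

/* Plain allocation, standing in for a non-cached connection. */
static struct example_conn *example_new_conn(void)
{
    struct example_conn *conn = calloc(1, sizeof(*conn));
    if (conn != NULL)
        conn->refcount = 1;
    return conn;
}

/* Allocation through the cache: the cache flags what it hands out. */
static struct example_conn *example_get_cached(void)
{
    struct example_conn *conn = example_new_conn();
    if (conn != NULL)
        conn->flags |= EXAMPLE_CONN_CACHED;
    return conn;
}

/* Release path that accepts either kind of connection. Cached ones are
 * reference counted; anything else is destroyed directly instead of
 * being silently ignored. */
static void example_release(struct example_conn *conn)
{
    assert(conn != NULL);
    if ((conn->flags & EXAMPLE_CONN_CACHED) && --conn->refcount > 0)
        return;                     /* still in use by another holder */
    free(conn);
    example_destroyed++;
}
```

The point of the flag is that the release function no longer needs a successful cache lookup to decide what to do with the object it was given.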
Change-Id: Ibe164ccd30a8ddd799438c28fd6e1d8a0a9040dd Reviewed-on: https://gerrit.openafs.org/13042 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 55fca11421055d0bcee79f118ea2a035393cc6e5 Author: Mark Vitale Date: Mon Apr 30 18:34:28 2018 -0400 rx: fix out-of-range value for RX_CONN_NAT_PING Commit 496fb87372555f6acddd4fd88b03c94c85f48511 ("rx: avoid nat ping until connection is attached") introduced functionality to defer turning on NAT ping for server connections until after reachability had been established for the client. Unfortunately, this feature could never work correctly because it assigned an out-of-range flag value of 256 (0x100) for the u_char flags field. Instead of calling this out as an error, both gcc and Solaris cc elide this flag so that it is never set in rx_SetConnSecondsUntilNatPing(). Furthermore, the test in rxi_ConnClearAttachWait() will always fail; therefore rxi_ScheduleNatKeepAliveEvent is never called after attach wait has ended. Fortunately, this bug is currently moot - not actually exposed in OpenAFS. (It was discovered by inspection). This is because there are currently no rx_connection objects in the tree that have both NAT ping and checkReach (rx_SetCheckReach) enabled. I also searched git history and found no time when this bug could ever have been exposed. This does raise the question of why the original commit was needed; but instead of reverting the original commit, this commit attempts to fix it. To prevent problems if NAT ping and checkReach are ever both enabled for an rx_connection, enlarge the rx_connection flags member so that the RX_CONN_NAT_PING value is no longer out of range.
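A small sketch of why a flag value of 0x100 cannot survive in an 8-bit flags field. The struct and function names are hypothetical and do not come from the rx sources; the explicit cast only makes visible the truncation that an implicit conversion to unsigned char would perform.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

struct narrow_conn { unsigned char flags; };   /* 8-bit flags field */
struct wide_conn   { uint32_t flags; };        /* enlarged flags field */

/* Sets a flag in the narrow field and reports whether a later test for
 * that flag would succeed. */
static int narrow_flag_sticks(unsigned int flag)
{
    struct narrow_conn conn = { 0 };

    assert(CHAR_BIT == 8);          /* the example assumes 8-bit chars */
    conn.flags = (unsigned char)(conn.flags | flag);
    return (conn.flags & flag) != 0;
}

/* Same operation against the wider field. */
static int wide_flag_sticks(unsigned int flag)
{
    struct wide_conn conn = { 0 };

    conn.flags |= flag;
    return (conn.flags & flag) != 0;
}
```

A value such as 0x80 still fits in the narrow field, so only flags at bit 8 and above are lost, which is why the problem is easy to miss.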
Change-Id: Ib667ece632f66fa5c63a76398acb3153fed6f9c3 Reviewed-on: https://gerrit.openafs.org/13041 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit d231134aadcaf2bd3a91f26ba6d3d451713a6fba Author: Andrew Deason Date: Mon May 18 12:38:31 2020 -0500 auth: Avoid cellconfig.c stdio renaming Since commit 35777145 (solaris-fopen-sucks-20060916), cellconfig.c has redirected fopen, fclose, and fgets to local functions on non-64bit-sparc Solaris, in order to work around that platform's stdio limitations. Commit 7c431f7571 (auth: retire writeconfig.c) moved the contents of writeconfig.c into cellconfig.c. The previous writeconfig.c contained some calls to stdio, including calling fprintf() on a pointer returned by fopen() in that file. Because fopen() was redirected to our local version, this means that afsconf_SetExtendedCellInfo() calls fopen() to get an afsconf_iobuffer*, and passes that pointer to the real system fprintf() later on (instead of a native FILE*). The compiler does warn about this, but this only happens on Solaris, where --enable-checking is not implemented, so the build never fails. To avoid this, remove the #defines for fopen, fgets, and fclose. Instead, change all of the old cellconfig.c callers to explicitly call afsconf_fopen, afsconf_fgets, and afsconf_fclose. On the affected Solaris platforms, we keep our local definitions, and for other platforms, we just make those functions call their system stdio equivalents. For the code that was pulled in from writeconfig.c, callers will just call the system fopen, fprintf, and fclose. We still keep our local afsconf_FILE* definition on all platforms, so the compiler will still do typechecking for our local afsconf_f* functions on all platforms. So now if we make a mistake, it should be a mistake on all platforms, so platforms with --enable-checking should flag the error. 
Change-Id: I4064d7f5ee82d5acab04a33b01c0603564a391e8 Reviewed-on: https://gerrit.openafs.org/14214 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Mark Vitale Tested-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit cd65475e95e25c8e7071e099a682bdcc03d2cce1 Author: Andrew Deason Date: Fri Jul 26 15:28:44 2019 -0500 afs: Let afs_ShakeLooseVCaches run longer Currently, when afs_ShakeLooseVCaches runs osi_TryEvictVCache, we check if osi_TryEvictVCache slept (i.e. dropped afs_xvcache/GLOCK). If we sleep over 100 times, then we stop trying to evict vcaches and return. If we have recently accessed a lot of AFS files, this limitation can severely reduce our ability to keep our number of vcaches limited to a reasonable size. For example: Say a Linux client runs a process that quickly accesses 1 million files (a simple 'find' command) and then does nothing else. A few minutes later, afs_ShakeLooseVCaches is run, but since all of the newly accessed vcaches have dentries attached to them, we will sleep on each one in order to try to prune the attached dentries. This means that afs_ShakeLooseVCaches will evict 100 vcaches, and then return, leaving us with still almost 1 million vcaches. This will happen repeatedly until afs_ShakeLooseVCaches finally works its way through all of the vcaches (which takes quite a while, if we only clear 100 at once), or the dentries get pruned by other means (such as, if Linux evicts them due to memory pressure). The limit of 100 sleeps was originally added in commit 29277d96 (newvcache-dont-spin-20060128), but the current effect of it was largely introduced in commit 9be76c0d (Refactor afs_NewVCache). It exists to ensure that afs_ShakeLooseVCaches doesn't take forever to run, but the limit of 100 sleeps may seem quite low, especially if those 100 sleeps run very quickly. 
To avoid the situation described above, instead of limiting afs_ShakeLooseVCaches based on a fixed number of sleeps, limit it based on how long we've been running, and set an arbitrary limit of roughly 3 seconds. Only check how long we've been running after 100 sleeps like before, so we're not constantly checking the time while running. Log a new warning if we exit afs_ShakeLooseVCaches prematurely if we've been running for too long, to help indicate what is going on. Change-Id: I65729ace748e8507cc0d5c26dec39e74d7bff5d2 Reviewed-on: https://gerrit.openafs.org/14254 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 9ff45e73cf3d91d12f09e108e1267e37ae842c87 Author: Andrew Deason Date: Mon Jul 16 16:53:34 2018 -0500 afs: Skip bulkstat if stat cache looks full Currently, afs_lookup() will try to prefetch dir entries for normal dirs via bulkstat whenever multiple pids are reading that dir. However, if we already have a lot of vcaches, ShakeLooseVCaches may be struggling to limit the vcaches we already have. Entering afs_DoBulkStat can make this worse, since we grab afs_xvcache repeatedly, we may kick out other vcaches, and we'll possibly create 30 new vcaches that may not even be used before they're evicted. To try to avoid this, skip running afs_DoBulkStat if it looks like the stat cache is really full. Change-Id: I1634530170a189f32cb962dd7df28f88bc758b71 Reviewed-on: https://gerrit.openafs.org/13256 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 0532f917f29bdb44f4933f9c8a6c05c7fecc6bbb Author: Andrew Deason Date: Mon Jul 16 16:44:14 2018 -0500 afs: Log warning when we detect too many vcaches Currently, afs_ShakeLooseVCaches has a kind of warning that is logged when we fail to free up any vcaches. 
This information can be useful to know, since it may be a sign that users are trying to access way more files than our configured vcache limit, hindering performance as we constantly try to evict and re-create vcaches for files. However, the current warning is not clear at all to non-expert users, and it can only occur for non-dynamic vcaches (which is uncommon these days). To improve this, try to make a general determination if it looks like the stat cache is "stressed", and log a message if so after afs_ShakeLooseVCaches runs (for all platforms, regardless of dynamic vcaches). Also try to make the message a little more user-friendly, and only log it (at most) once per 4 hours. Determining whether the stat cache looks stressed or not is difficult and arguably subjective (especially for dynamic vcaches). This commit draws a few arbitrary lines in the sand to make the decision, so at least something will be logged in the cases where users are constantly accessing way more files than our configured vcache limit. Change-Id: I022478dc8abb7fdef24ccc06d477b349cca759ac Reviewed-on: https://gerrit.openafs.org/13255 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 42fb8786a8fff30ea97524f896c5aee4fa307f89 Author: Mark Vitale Date: Thu Jun 25 11:45:19 2020 -0400 viced: propagate return from CleanupTimedOutCallBacks_r The fileserver's FiveMinuteCheckLWP periodically calls CleanupTimedOutCallBacks, and logs an informational message if the return code indicates that any callbacks were discarded. However, since the original IBM code import, CleanupTimedOutCallBacks has 1) ignored the return value from CleanupTimedOutCallBacks_r and 2) unconditionally returned 0. This makes the informational message essentially dead code. Instead, check the code from CleanupTimedOutCallBacks_r and pass it back to the caller.
Change-Id: I631831c398e43431b79f4a3a0c6f01307ac0c05e Reviewed-on: https://gerrit.openafs.org/14256 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit f9d20c631d7280ce00125a1208331931a6e3f31c Author: Andrew Deason Date: Thu Jun 18 21:16:09 2020 -0500 LINUX: Close cacheFp if no ->readpage in fastpath In afs_linux_readpage_fastpath, if we discover that our disk cache fs has no ->readpage function, we'll 'goto out', but we never close our cacheFp. To make sure we close it, add a filp_close() call to the 'goto out' cleanup code. Change-Id: I371c1d7ec51b03447fbcbe58fb89be7be0235022 Reviewed-on: https://gerrit.openafs.org/14252 Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit af73b9a3b1fc625694807287c0897391feaad52d Author: Cheyenne Wills Date: Thu Jul 2 13:39:27 2020 -0600 LINUX: Don't panic on some file open errors Commit 'LINUX: Return NULL for afs_linux_raw_open error' (f6af4a155) updated afs_linux_raw_open to return NULL on some errors, but still panics if obtaining the dentry fails. Commit 'afs: Verify osi_UFSOpen worked' (c6b61a451) updated callers of osi_UFSOpen to verify whether or not the open was successful. This meant osi_UFSOpen (and routines it calls) could pass back an error indication rather than panic when an error is encountered. Update afs_linux_raw_open to return a failure instead of panic if unable to obtain a dentry. Update osi_UFSOpen to return a NULL instead of panic if unable to obtain memory or fails to open the file. All callers of osi_UFSOpen handle a fail return, though some will still issue a panic. Update afs_linux_readpage_fastpath and afs_linux_readpages to not panic if afs_linux_raw_open fails. Instead of panic, return an error. For testing, an error can be forced by removing a file from the cache directory. 
Note this work is based on a commit by pruiter@sinenomine.net Change-Id: Ic47e4868b4f81d99fbe3b2e4958778508ae4851f Reviewed-on: https://gerrit.openafs.org/14242 Reviewed-by: Andrew Deason Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit d2d27f975df13c3833898611dacff940a5ba3e2a Author: Cheyenne Wills Date: Fri Jun 19 08:01:14 2020 -0600 afs: Avoid panics on failed return from afs_CFileOpen afs_CFileOpen is a macro that invokes the open "method" of the afs_cacheOps structure, and for disk caches the osi_UFSOpen function is used. Currently osi_UFSOpen will panic if there is an error encountered while opening a file. Prepare to handle osi_UFSOpen function returning a NULL instead of issuing a panic (future commit). Update callers of afs_CFileOpen to test for an error and to return an error instead of issuing a panic. While this commit eliminates some panics, it does not address some of the more complex cases associated with errors from afs_CFileOpen. Change-Id: I2bdd525633dd44ebf8e26fcfd7059dfdfffb6142 Reviewed-on: https://gerrit.openafs.org/14241 Reviewed-by: Andrew Deason Reviewed-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 7d85ce221d6ccc19cf76ce7680c74311e4ed2632 Author: Cheyenne Wills Date: Thu Jun 25 10:43:53 2020 -0600 LINUX 5.8: use lru_cache_add With Linux-5.8-rc1 commit 'mm: fold and remove lru_cache_add_anon() and lru_cache_add_file()' (6058eaec), the lru_cache_add_file function is removed since it was functionally equivalent to lru_cache_add. Replace lru_cache_add_file with lru_cache_add. 
Introduce a new autoconf test to determine if lru_cache_add is present.

For reference, the Linux changes associated with the lru caches:

    __pagevec_lru_add introduced before v2.6.12-rc2

    lru_cache_add_file introduced in v2.6.28-rc1
    __pagevec_lru_add_file replaces __pagevec_lru_add in v2.6.28-rc1
        vmscan: split LRU lists into anon & file sets (4f98a2fee)

    __pagevec_lru_add removed in v5.7 with a note to use lru_cache_add_file
        mm/swap.c: not necessary to export __pagevec_lru_add() (bde07cfc6)

    lru_cache_add_file removed in v5.8
        mm: fold and remove lru_cache_add_anon() and lru_cache_add_file() (6058eaec)

    lru_cache_add exported
        mm: fold and remove lru_cache_add_anon() and lru_cache_add_file() (6058eaec)

OpenAFS will use:

    lru_cache_add on 5.8 kernels
    lru_cache_add_file from 2.6.28 through 5.7 kernels
    __pagevec_lru_add/__pagevec_lru_add_file on pre 2.6.28 kernels

Change-Id: I79ebe4a81425bf8a8a327ddf2d3474aff9df039d Reviewed-on: https://gerrit.openafs.org/14249 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Yadavendra Yadav Reviewed-by: Benjamin Kaduk commit ae9ea8da699ba3f2ab0f7d76ae3333349fe3dfa3 Author: Benjamin Kaduk Date: Tue Jun 30 21:55:45 2020 -0700 Recode a couple files from ISO 8859-1 to UTF-8 Reported by Debian's lintian(1). The CellServDB, as an externally maintained file, is left unchanged. Change-Id: I3bf241b924cb8cd7799a4c3e799f6acd375b2e8a Reviewed-on: https://gerrit.openafs.org/14265 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit ba8b92401b8cb2f5a5306313c2702cb36cba083c Author: Andrew Deason Date: Sun Jul 8 15:00:02 2018 -0500 afs: Bound afs_DoBulkStat dir scan Currently, afs_DoBulkStat will scan the entire directory blob, looking for entries to stat. If all or almost all entries are already stat'd, we'll scan through the entire directory, doing nontrivial work on each entry (we grab afs_xvcache, at least). All of this work is pretty pointless, since the entries are already cached and so we won't do anything.
If many processes are trying to acquire afs_xvcache, this can contribute to performance issues. To avoid this, provide a constant bound on the number of entries we'll search through: nentries * 4. The current arbitrary limits cap nentries at 30, so this means we're capping the afs_DoBulkStat search to 120 entries. Change-Id: I66e9af5b27844ddf6cf37c8286fcc65f8e0d3f96 Reviewed-on: https://gerrit.openafs.org/13253 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 6c808e05adb0609e02cd61e3c6c4c09eb93c1630 Author: Andrew Deason Date: Thu Jul 13 17:40:36 2017 -0500 afs: Avoid needless W-locks for afs_FindVCache The callers of afs_FindVCache must hold at least a read lock on afs_xvcache; some hold a shared or write lock (and set IS_SLOCK or IS_WLOCK in the given flags). Two callers (afs_EvalFakeStat_int and afs_DoBulkStat) currently hold a write lock, but neither of them need to. In the optimal case, where afs_FindVCache finds the given vcache, this means that we unnecessarily hold a write lock on afs_xvcache. This can impact performance, since afs_xvcache can be a very frequently accessed lock (a simple operation like afs_PutVCache briefly holds a read lock, for example). To avoid this, have afs_DoBulkStat hold a shared lock on afs_xvcache, upgrading to a write lock when needed. afs_EvalFakeStat_int doesn't ever need a write lock at all, so just convert it to a read lock. Change-Id: I5bd58b9e3a577c9e1ebf1bc3719e65a6c0af5cb8 Reviewed-on: https://gerrit.openafs.org/12656 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit e44d6441c8786fdaaa1fad1b1ae77704c12f7d60 Author: Kailas Zadbuke Date: Wed Jun 3 15:44:08 2020 +0530 util: Handle serverLogMutex lock across forks If a process forks when another thread has serverLogMutex locked, the child process inherits the locked serverLogMutex. This causes a deadlock when code in the child process tries to lock serverLogMutex, since we can never unlock serverLogMutex because the locking thread no longer exists. 
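One common way to keep a mutex from being inherited in a locked state across fork() is to register fork handlers with pthread_atfork(). The sketch below shows that general pattern with hypothetical names (example_log_mutex, example_log_setup, fork_and_use_log); it is not the OpenAFS serverLog code.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a logging mutex shared by several threads. */
static pthread_mutex_t example_log_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the parent just before fork(): take the lock, so no other
 * thread can be holding it at the moment the child is created. */
static void log_atfork_prepare(void)
{
    int rc = pthread_mutex_lock(&example_log_mutex);
    assert(rc == 0);
    (void)rc;
}

/* Runs after fork() in the parent and in the child: drop the lock that
 * the prepare handler took. */
static void log_atfork_release(void)
{
    int rc = pthread_mutex_unlock(&example_log_mutex);
    assert(rc == 0);
    (void)rc;
}

/* Registers the handlers; returns 0 on success. */
static int example_log_setup(void)
{
    return pthread_atfork(log_atfork_prepare, log_atfork_release,
                          log_atfork_release);
}

/* Forks a child that takes and releases the mutex, and returns the
 * child's exit status (or -1 on failure). Without the handlers, the
 * child could block forever if another thread held the lock at fork time. */
static int fork_and_use_log(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_mutex_lock(&example_log_mutex);
        pthread_mutex_unlock(&example_log_mutex);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the prepare handler blocks until the lock is free, the fork itself waits for any thread that is in the middle of logging.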
This can happen in the salvageserver, since the salvageserver locks serverLogMutex in different threads, and forks to handle salvage jobs. To avoid this deadlock, we register handlers using pthread_atfork() so that the serverLogMutex will be held during the fork. The fork will be blocked until the worker thread releases the serverLogMutex. Hence the serverLogMutex will be held until the fork is complete and it will be released in the parent and child threads. Thanks to Yadavendra Yadav(yadayada@in.ibm.com) for working with me on this issue. Change-Id: I191c8272825c1667bb2150146e04b1dfe36a54e4 Reviewed-on: https://gerrit.openafs.org/14239 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 19cd454f11997d286bc415e9bc9318a31f73e2c6 Author: Andrew Deason Date: Mon Jul 16 16:08:13 2018 -0500 afs: Split out bulkstat conditions into a function Our current if() statement for determining whether we should run afs_DoBulkStat to prefetch dir entries is a bit large, and grows over time. Split this logic out into a separate function to make it easier to maintain, and add some comments to help explain each condition. This commit should have no visible effects; it's just code reorganization. Change-Id: I0086189308d2f5e4b321c63f24110d74cda6433c Reviewed-on: https://gerrit.openafs.org/13254 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit a05d5b7503e466e18f5157006c1de2a2f7d019f7 Author: Andrew Deason Date: Thu Jul 13 17:40:21 2017 -0500 afs: Change VerifyVCache2 calls to VerifyVCache afs_VerifyVCache is a macro that (on most platforms) effectively expands to: if ((avc->f.states & CStatd)) { return 0; } else { return afs_VerifyVCache2(...); } Some callers call afs_VerifyVCache2 directly, since they already check for CStatd for other reasons. A few callers currently call afs_VerifyVCache2, but without guaranteeing that CStatd is not set. Specifically, in afs_getattr and afs_linux_VerifyVCache, CStatd could be set while afs_CreateReq drops GLOCK. 
And in afs_linux_readdir, CStatd could be cleared at multiple different points before the VerifyVCache call. This can result in afs_VerifyVCache2 acquiring a write-lock on the vcache, even when CStatd is already set, which is an unnecessary performance hit. To avoid this, change these call sites to use afs_VerifyVCache instead of calling afs_VerifyVCache2 directly, which skips the write lock when CStatd is already set. Change-Id: I7b75c9755af147b42a48160fa90c9849f2f03ddb Reviewed-on: https://gerrit.openafs.org/12655 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 7c9fb4455745ed0015d4a6311bd4a7770efbf40d Author: Mark Vitale Date: Thu Jun 18 13:43:35 2020 -0400 LINUX: replace BUG() call with osi_Panic() in osi_linux_free If osi_linux_free fails, it printf's an error message, then calls BUG(). This is the sole open-coded call to BUG() in OpenAFS; all other calls to BUG() are indirect via osi_Panic(). For consistency, eliminate this direct BUG() call by replacing the printf and BUG() with an equivalent osi_Panic(). This also ensures that the error message is logged as critical, and prefixed with "openafs:". Change-Id: Id319dffa859308528a66991bbbc522ca49552d51 Reviewed-on: https://gerrit.openafs.org/14250 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit d8ec294534fcdee77a2ccd297b4b167dc4d5573d Author: Cheyenne Wills Date: Tue Jun 16 18:35:46 2020 -0600 LINUX 5.8: do not set name field in backing_dev_info Linux-5.8-rc1 commit 'bdi: remove the name field in struct backing_dev_info' (1cd925d5838) removed the name field. Do not set the name field in the backing_dev_info structure if it is not available.
Uses an existing config test 'STRUCT_BACKING_DEV_INFO_HAS_NAME'. Note the name field in the backing_dev_info structure was added in Linux-2.6.32. Change-Id: I20b80e49e8a15a2949003101f24d9ce39f63b59b Reviewed-on: https://gerrit.openafs.org/14248 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit c48072b9800759ef1682b91ff1e962f6904a2594 Author: Cheyenne Wills Date: Thu Jun 18 16:39:22 2020 -0600 LINUX 5.8: Replace kernel_setsockopt with new funcs Linux 5.8-rc1 commit 'net: remove kernel_setsockopt' (5a892ff2facb) retires the kernel_setsockopt function. In prior kernel commits new functions (ip_sock_set_*) were added to replace the specific functions performed by kernel_setsockopt. Define new config test 'HAVE_IP_SOCK_SET' if the 'ip_sock_set' functions are available. The config define 'HAVE_KERNEL_SETSOCKOPT' is no longer set in Linux 5.8. Create wrapper functions that replace the kernel_setsockopt calls with calls to the appropriate Linux kernel function(s) (depending on what functions the kernel supports). Remove the unused 'kernel_getsockopt' function (used for building with pre 2.6.19 kernels). For reference: Linux 2.6.19 introduced kernel_setsockopt; Linux 5.8 removed kernel_setsockopt and replaced the functionality with a set of new functions (ip_sock_set_*). Change-Id: I517b674303c5decc19313d9de51d04ddef36b421 Reviewed-on: https://gerrit.openafs.org/14247 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit cbc5c4b51fcd0a990216fc31abe308a9e85fd9df Author: Andrew Deason Date: Wed Jun 17 12:23:46 2020 -0500 tests: Modernize writekeyfile.c tests/auth/writekeyfile.c contains some code used to generate tests/auth/KeyFile, which is used to test code interpreting the old-style KeyFile format. This code currently has a few problems: - We don't check the results of afstest_mkdtemp, which could allow symlink attacks from other users on the system. - We duplicate some logic from afstest_BuildTestConfig, in order to build a temporary config dir.
- writekeyfile isn't built or run by default (it only exists to generate KeyFile, so it's almost never run), so eventual bitrot is quite likely, and the existing code already generates warnings. To avoid this, change writekeyfile.c to use the existing afstest_BuildTestConfig to generate a local config dir. To ensure we avoid bitrot, build writekeyfile by default, and create a test to run it, to make sure it can generate a KeyFile as expected. Note that the KeyFile.short we test against is different than the KeyFile currently in the tree. The existing KeyFile was generated from an older OpenAFS release, which always generated 100-byte KeyFiles, even if we only have a few keys. The current codebase only writes out as much key data as needed, so the generated KeyFiles are shorter (but still understandable by older OpenAFS releases). Keep the old 100-byte KeyFile around, since that's what older OpenAFS would generate, and create a new KeyFile.short to test against, to make sure our code for generating KeyFiles doesn't change any further. Change-Id: Ibe9246c6dd808ed2b2225dd7be2b27bbdee072fd Reviewed-on: https://gerrit.openafs.org/14246 Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 22a66e7b7e1d73437a8c26c2a1b45bc4ef214e77 Author: Cheyenne Wills Date: Tue Jun 16 15:20:20 2020 -0600 tests: Use usleep instead of nanosleep Commit "Build tests by default" 68f406436cc21853ff854c514353e7eb607cb6cb changes the build so tests are always built. On Solaris 10 the build fails because nanosleep is in librt, which we do not link against. Replace nanosleep with usleep. This avoids introducing extra configure tests just for Solaris 10. Note that with Solaris 11 nanosleep was moved from librt to libc, the standard C library. 
Change-Id: I6639f32bb8c8ace438e0092a866f06561dad54f1 Reviewed-on: https://gerrit.openafs.org/14244 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 5f4a681eeb5e353f09aa895770f7336a2b381467 Author: Cheyenne Wills Date: Wed Jun 17 13:08:18 2020 -0600 tests: Emulate mkdtemp when not available Commit "Build tests by default" 68f406436cc21853ff854c514353e7eb607cb6cb changes the build so tests are always built. On Solaris 10 Update 10 and earlier the build fails because the mkdtemp function is not available. Introduce a wrapper 'afstest_mkdtemp' that uses mkdtemp if available, otherwise uses mktemp/mkdir. Change-Id: I0118f838ed9a89927e2ddac4cad822574601558a Reviewed-on: https://gerrit.openafs.org/14243 Reviewed-by: Andrew Deason Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 188ca8bf5276084a6892e5cfba3e24e478804382 Author: Michael Meffie Date: Thu Apr 16 09:41:41 2020 -0400 make-release: Run git describe once Run git describe once at the beginning of make-release to find the version information used to derive the tarball file names and saved in the .version file. This is a cleanup and refactoring change to prepare for a future commit. Change-Id: I0debeeffa5d2c63ab1498588766cb36424d15cd5 Reviewed-on: https://gerrit.openafs.org/14150 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit d0753c0ace8e43a7dc1db35c3f41130352278c04 Author: Michael Meffie Date: Fri Mar 27 11:29:24 2020 -0400 make-release: Create output directory if needed Automatically create the --dir directory if it does not already exist, which makes this script slightly easier to use. Remove the now unneeded mkdir from the top-level makefile.
Change-Id: I1f4561120a70263b0b2b194e65fec55fb5666f40 Reviewed-on: https://gerrit.openafs.org/14115 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit d20d392091a13c3944973bcb0ce84783a4e0d179 Author: Michael Meffie Date: Thu Apr 16 07:21:51 2020 -0400 make-release: Remove unused optional version argument The make-release help shows an optional version argument, but in fact the version info is always generated from the git tag name argument, which makes sense when creating releases. Continue to throw away the second positional argument just in case someone is still passing a second argument, but issue a warning if they do. Change-Id: Ie4c6e6efb7693e53a02fd009eecd64b47250c848 Reviewed-on: https://gerrit.openafs.org/14149 Reviewed-by: Cheyenne Wills Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 46eb00ffa1c6d7deda2c1b1b4fa1780b36e64417 Author: Michael Meffie Date: Thu Apr 16 07:37:39 2020 -0400 make-release: Clean up whitespace and spelling Fix whitespace errors, convert tabs to spaces, fix spelling errors, and fix pod markup in the make-release script. Change-Id: I24ede59d44a8818d89de454c0935586fccbd5d9a Reviewed-on: https://gerrit.openafs.org/14148 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit c9eab4b1ee947067bfcc3678bb89896b66f404f8 Author: Andrew Deason Date: Tue Jun 2 11:12:58 2020 -0500 afs: Remove osi_GetuTime osi_GetuTime has always been #define'd to be the same thing as osi_GetTime, ever since OpenAFS 1.0. Get rid of this redundant macro, and just use osi_GetTime instead. 
Change-Id: Ic826aeaa17314019b79cfb2df04a79309aa31db5 Reviewed-on: https://gerrit.openafs.org/14236 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit dedb1aed97e64036d8098e12904c9eb54fda7010 Author: Jeffrey Altman Date: Sun May 31 13:05:02 2020 -0400 afs/viced: New UAE (unified_afs) error codes The following registrations were submitted to registrar@central.org as [rt.central.org #135105]. UAECANCELED, "Operation canceled" (49733499L) UAENOTRECOVERABLE, "State not recoverable" (49733500L) UAENOTSUP, "Not supported" (49733501L) UAEOTHER, "Other" (49733502L) UAEOWNERDEAD, "Owner dead" (49733503L) UAEPROCLIM, "Too many processes" (49733504L) UAEDISCON, "Graceful shutdown in progress" (49733505L) Change-Id: I1458b8a9441b3826756ca67af70eee5e835d989f Reviewed-on: https://gerrit.openafs.org/14235 Reviewed-by: Jeffrey Hutzelman Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit ed9a3b7165ae2300ebb185ca53e698e5ef93173b Author: Cheyenne Wills Date: Fri May 29 10:36:13 2020 -0600 util: Fix segfault in the func ConstructLocalPath The function ConstructLocalPath will segfault if passed a NULL for the command path parameter. Update ConstructLocalPath to test the passed command path for a NULL and return ENOENT. The segfault can be triggered by setting up a BosConfig with a dafs bnode that does not contain all the required parms. This setup results in bosserver segfaulting. With the fix, bosserver now logs an error and exits cleanly. Change-Id: I26015c8accd829f3101b073964777b41d16b07f7 Reviewed-on: https://gerrit.openafs.org/14223 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 336f5d91c6f4e93f77560d456fb29fbd82b237e5 Author: Mark Vitale Date: Sun May 10 20:53:22 2020 -0400 DARWIN: ensure OpenAFS.pkg is signed Installation fails because the OpenAFS.pkg was inadvertently omitted from the codesign logic. Ensure that the package is signed.
Change-Id: I0745146bc523750912dd6ee95fc16a70572be175 Reviewed-on: https://gerrit.openafs.org/14221 Reviewed-by: Marcio Brito Barbosa Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit d3f8d8122880de9f5b25868b39efd1cc7d385ff6 Author: Mark Vitale Date: Sun May 10 20:51:59 2020 -0400 DARWIN: ensure PrefPane materials are properly signed Notarization fails because some prefPane materials were inadvertently omitted by the codesign logic. Ensure that these objects are properly signed. Change-Id: Ifc58e6f834a3237b7991257ee85de4e90fc3da12 Reviewed-on: https://gerrit.openafs.org/14220 Reviewed-by: Marcio Brito Barbosa Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 80afdc2adabb098394e1b2178ba301964868befe Author: Andrew Deason Date: Fri Dec 20 21:02:45 2019 -0600 vol: Avoid building devname.c on AFS_NAMEI_ENV Everything in devname.c is for the inode vol backend, so skip building it when AFS_NAMEI_ENV is defined. While we're doing this, alter the #ifdefs inside this file to assume that we're not on XBSD, DARWIN, or LINUX, since those platforms are all namei-only. 
Change-Id: I3a46568940e1a865a381c1ac7e98aea94df9f3ef Reviewed-on: https://gerrit.openafs.org/13995 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 99eedfdb1659dd48d12542ad063d4711d401e153 Author: Andrew Deason Date: Fri Dec 20 21:01:13 2019 -0600 vol: Indent ifdef maze in devname.c Change-Id: I371eb1d79ae9fb3f07af993be834af6f6b59c100 Reviewed-on: https://gerrit.openafs.org/13994 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 71ce9fff8e682a77e17490a54e091656cbf96925 Author: Tim Creech Date: Mon Dec 9 21:13:58 2019 -0500 FBSD: Add support for FreeBSD 12.1 Change-Id: I5779c586b6b1255de0ee0dea66b09f3a5dffddc1 Reviewed-on: https://gerrit.openafs.org/13982 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 20dc2832268eb81d40e798da0d424c98cf26062c Author: Andrew Deason Date: Sun Nov 24 22:36:17 2019 -0600 FBSD: Ignore VI_DOOMED vnodes Currently on FreeBSD, osi_TryEvictVCache calls vgone() for our vnode after checking if the given vcache is in use. vgone() then calls our VOP_RECLAIM operation, which calls afs_vop_reclaim, which calls afs_FlushVCache to finally actually flush the vcache. The current approach has at least the following major issues: - In afs_vop_reclaim, we return success even if afs_FlushVCache() fails. This allows FreeBSD to reuse the vnode for another file, but the vnode is still being referenced by our vcache, which is referenced by the global VLRU and various other structures. This causes all kinds of weird errors, since we try to use the underlying vnode for different files. - After the relevant checks in osi_TryEvictVCache are done, another thread can acquire a new reference to our vcache (this can happen while vgone() is running up until the vnode is locked). This new reference will cause afs_FlushVCache to fail. - Our afs_vop_reclaim callback is called while the vnode is locked, and can acquire afs_xvcache. Other code locks the vnode while afs_xvcache is already held (such as afs_PutVCache -> vrele). 
This can lead to deadlocks if two threads try to run these codepaths for the same vnode at the same time. - afs_vop_reclaim optionally acquires afs_xvcache based on the return value of CheckLock(&afs_xvcache). However, CheckLock just returns if that lock is locked by anyone, not if the current thread holds the lock. This can result in the rest of the function running without afs_xvcache actually being held if we drop AFS_GLOCK at any point. - osi_TryEvictVCache() tries to vn_lock() the target vnode, but we may already have another vnode locked in the current thread. If the vnode we're trying to evict is a descendant of a vnode we already have locked, this can deadlock. To fix these issues, make some changes to how our vcache management works on FreeBSD: - Do not allow anyone to hold a new reference on a VI_DOOMED vnode. We do this by checking for VI_DOOMED in osi_vnhold, and returning an error if VI_DOOMED is set. - In afs_vop_reclaim, panic if afs_FlushVCache fails. With the new VI_DOOMED check, afs_FlushVCache should now never fail; and if it somehow does, panic'ing immediately is better than corrupting various structures and panic'ing later on. - Move around some of the relevant locking in afs_vop_reclaim to fix the lock-related issues. - In osi_TryEvictVCache, don't wait for the vnode lock (LK_NOWAIT); treat the vnode as "in use" if we can't immediately obtain the lock. Thanks to tcreech@tcreech.com and kaduk@mit.edu for insight and help investigating the relevant issues.
FIXES 135041 Change-Id: I23e94ecebbddc8c68a8f4ea918d64efd0f9f9dfd Reviewed-on: https://gerrit.openafs.org/13972 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 145c90bdbeeff4ea95acacd7dc110f0c6fcba281 Author: Mark Vitale Date: Sun May 10 22:13:13 2020 -0400 DARWIN: remove vestigial etap_event_t typedefs These typedefs have been present since commit a41175cfbbf4d06ccfe14ae54bef8b7464ecd80b "initial-darwin-support-20010327"; at least some of this material was obtained directly from IBM after the initial code import. Based on research of old Darwin source code and kernel documentation, the Event Trace Analysis Package (ETAP) was a lock-profiling interface provided in older versions of Mach and xnu. ETAP was not enabled by default; the kernel had to be recompiled with certain options to enable it. Support for ETAP was removed from the xnu tree sometime between xnu-517 (10.3 Panther) and xnu-792 (10.4 Tiger), although some references remain in the latter under PPC support (osfmk/ppc/hw_lock.s). All remaining references to etap_event_t disappeared when PPC support was removed, some time between xnu-1456.1.26 (10.6 Snow Leopard) and xnu-1699.24.8 (10.7.2 Lion). Therefore, it is possible that these typedefs were needed in the past by (IBM/Transarc) AFS to support use of some lock APIs (e.g., simple_lock_init, usimple_lock_init) after the ETAP code was withdrawn from xnu. However, these typedefs have probably always been vestigial for OpenAFS, because OpenAFS has never used any lock API that took etap_event_t as an argument. Regardless, OpenAFS does not need these definitions to build and run on any currently supported version of macOS. Remove the vestigial code. No functional change should be incurred by this commit. 
Change-Id: I39b3f82a8933d15ef5b5de5eb92366c0a31f8bb6 Reviewed-on: https://gerrit.openafs.org/14219 Tested-by: BuildBot Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit f065706fed4edd53376a33339fe20de686eee6a1 Author: Mark Vitale Date: Sun May 10 22:07:39 2020 -0400 DARWIN: remove errant typedef for etap_event_t This code has been dead since its introduction, because XAFS_DARWIN_ENV is a typo for AFS_DARWIN_ENV. Introduced from day 1 of DARWIN support with commit a41175cfbbf4d06ccfe14ae54bef8b7464ecd80b "initial-darwin-support-20010327". No functional change should be incurred by this commit. Change-Id: I6b74f01b4dd1230559ac8d75f0644071357f38b7 Reviewed-on: https://gerrit.openafs.org/14218 Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit c6eff25be9fc959f666b33425c9ee2635224826e Author: Mark Vitale Date: Mon May 18 14:19:25 2020 -0400 Convert all osi_timeval_t to osi_timeval32_t Since commit 130144850c6d05bc69e06257a5d7219eb98697d8 "xstat: cm xstat time values are 32 bit", OpenAFS has had two timeval definitions: osi_timeval_t and osi_timeval32_t. Since they are functionally equivalent, convert all references to osi_timeval_t to osi_timeval32_t. This makes clear that this struct is always expected to contain 32-bit members for tv_sec and tv_usec. There are still a few platforms where osi_timeval32_t is mistakenly defined with 64-bit members; these will be addressed in future commits. No functional change should be incurred by this commit. Change-Id: I3e8e44235e813571723fcd114194f6cb83de90e4 Reviewed-on: https://gerrit.openafs.org/14215 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit d6101128664918e6fcefbaeb68c4c1d439851411 Author: Mark Vitale Date: Mon May 4 17:35:05 2020 -0400 UKERNEL: remove dead code osi_SetTime osi_SetTime has been dead code since the original IBM code import. Remove it from the tree. 
No functional change is incurred by this commit. Change-Id: I25612a044ad550d798003979afc6845e502ebe3b Reviewed-on: https://gerrit.openafs.org/14191 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 03f44172180563cb9d12d79e5512aae815fee899 Author: Mark Vitale Date: Tue May 5 11:26:00 2020 -0400 UKERNEL: remove redundant declaration of osi_GetTime Commit c861bb0d779b54236b63eda87d9dfaf7792d1659 "Additional UKERNEL headers, prototyping and other fixes" added the following lines to src/rx/rx_prototypes.h: #if defined(UKERNEL) && !defined(osi_GetTime) extern int osi_GetTime(struct timeval *tv); #endif However, this appears to be redundant with the declaration in src/afs/afs_prototypes.h: #ifdef UKERNEL ... extern int osi_GetTime(struct timeval *tv); ... #endif which was added much earlier with commit 8f2df21ffe59e9aa66219bf24656775b584c122d "pull-prototypes-to-head-20020821". Remove the redundant declaration in rx/rx_prototypes.h. No functional change is incurred by this commit. Change-Id: I2032d302e862eed47250357e604cba4f26e89814 Reviewed-on: https://gerrit.openafs.org/14192 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 3ab022fda9d2bde603c032d4a5bff0f79e825f3d Author: Mark Vitale Date: Thu Apr 16 09:02:00 2020 -0400 afs: remove commented xstats externs Extern declarations for the xstats recording areas have been commented out since 8f2df21ffe59e9aa66219bf24656775b584c122d "pull-prototypes-to-head-20020821". Remove the vestigial comments. No functional change is incurred by this commit. Change-Id: Ieef9a4b21e78db8d5427bed7b621ba043663b1d1 Reviewed-on: https://gerrit.openafs.org/14197 Reviewed-by: Benjamin Kaduk Reviewed-by: Andrew Deason Tested-by: BuildBot commit 4caadf71f556f789bcdd2bcc80b9642630329421 Author: Mark Vitale Date: Sun Apr 5 17:10:42 2020 -0400 afs: remove stats dead code afs_GetCMSTats, afs_AddToMean, and macro AFS_MEANCNT have been dead code since the original IBM code import.
Remove them from the tree. No functional change is incurred by this commit. Change-Id: Icd6aeff7896d69a4d334531b5e0c632d807457ce Reviewed-on: https://gerrit.openafs.org/14196 Reviewed-by: Benjamin Kaduk Reviewed-by: Andrew Deason Tested-by: BuildBot commit 9a5790cfbb8e7b1a4a2e832911c71da49f604c20 Author: Mark Vitale Date: Mon May 18 17:20:26 2020 -0400 LINUX 5.6: define osi_timeval32_t for 32-bit Linux For 32-bit Linux (e.g., arch i586), AFS_LINUX_64BIT_KERNEL is not defined, so osi_timeval32_t is defined as a typedef of the native 'timeval'. However, as of commit c766d1472c70d25ad475cf56042af1652e792b23 "y2038: hide timeval/timespec/itimerval/itimerspec types" (Linux 5.6), the native timeval struct is no longer available. On such a kernel, the OpenAFS build will fail because osi_timeval32_t is not properly defined. Instead, add new conditionals to properly define osi_timeval32_t for this platform. Change-Id: I1eddeeb3651dcd3c55920ab1d2ad2838f4729bdd Reviewed-on: https://gerrit.openafs.org/14216 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 13e44b2b200cd99d0df4e03cf6413d3a6915783f Author: Andrew Deason Date: Mon Nov 18 23:17:12 2019 -0600 afs: Refactor osi_vnhold/AFS_FAST_HOLD Make a few changes to osi_vnhold and AFS_FAST_HOLD: - Currently, the second argument of osi_vnhold ("retry") is never used by any implementation. Get rid of it. - AFS_FAST_HOLD() is the same as osi_vnhold(). Get rid of AFS_FAST_HOLD, and just have all callers use osi_vnhold instead. - Allow osi_vnhold to return an error, and adjust callers to handle it. - Change osi_vnhold to be a real function, instead of a macro, to make nontrivial implementations less cumbersome. Most platforms never return an error from osi_vnhold(), so the added code paths to check the return value of osi_vnhold() will not trigger. However, this lets us add future commits that do make osi_vnhold() return an error. 
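The refactored calling convention described above can be pictured with a small userspace sketch. Everything in it is hypothetical (struct sketch_vnode, sketch_vnhold, sketch_use_vnode, and the choice of ENOENT as the error); it is not the OpenAFS implementation and only illustrates a hold operation that is a real function, takes a single argument, and may return an error that callers have to check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical vnode stand-in; not an OpenAFS or kernel structure. */
struct sketch_vnode {
    int refs;      /* reference count */
    int doomed;    /* nonzero if the vnode is being torn down */
};

/* Take a reference. Returns 0 on success, or an errno-style code. */
int
sketch_vnhold(struct sketch_vnode *vp)
{
    assert(vp != NULL);
    if (vp->doomed)
        return ENOENT;   /* arbitrary error code chosen for this sketch */
    vp->refs++;
    return 0;
}

/* A caller adjusted to handle the error instead of assuming success. */
int
sketch_use_vnode(struct sketch_vnode *vp)
{
    int code = sketch_vnhold(vp);
    if (code != 0)
        return code;     /* no reference was taken, so nothing to release */
    /* ... work with the held vnode would go here ... */
    vp->refs--;          /* drop the reference we took */
    return 0;
}
```

On a platform whose hold can never fail, the error branch in the caller is simply never taken, which matches the note that the added code paths will not trigger there.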
Change-Id: Id2f3717be6c305d06305685247ac789815e1ebf7 Reviewed-on: https://gerrit.openafs.org/13971 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit d01398731550b8a93b293800642c3e1592099114 Author: Andrew Deason Date: Fri May 1 15:02:08 2020 -0500 vlserver: Return error when growing beyond 2 GiB In the vlserver, when we add a new vlentry or extent block, we grow the VLDB by doing something like this: vital_header.eofPtr += sizeof(item); Since we don't check for overflow, and all of our offset-related variables are signed 32-bit integers, this can cause some odd behavior if we try to grow the database to be over 2 GiB in size. To avoid this, change the two places in vlserver code that grow the database to use a new function, grow_eofPtr(), which checks for 31-bit overflow. If we are about to overflow, log a message and return an error. See the following for a specific example of our "odd behavior" when we overflow the 2 GiB limit in the VLDB: With 1 extent block, we can create 14509076 vlentries successfully. On the 14509077th vlentry, we'll attempt to write the entry to offset 2147483560 (0x7FFFFFA8). Since a vlentry is 148 bytes long, we'll write all the way through offset 2147483707 (0x8000003B), which is over the 31-bit limit. In the udisk subsystem, this results in writing to page numbers 2097151, and -2097152 (since our ubik pages are 1k, and going over the 31-bit limit causes us to treat offsets as negative). These pages start at physical offsets 2147482688 (0x7FFFFC40) and -2147483584 (-0x7FFFFFC0) in our vldb.DB0 (where offset is page*1024+64). Modifying each of these pages involves reading in the existing page first, modifying the parts we are changing, and writing it back. This works just fine for 2097151, but of course fails for -2097152. 
The latter fails in DReadBuffer when eventually our pread() fails with EINVAL, and causes ubik to log the message: Ubik: Error reading database file: errno=22 But when DReadBuffer fails, DReadBufferForWrite assumes this is due to EOF, and just creates a new buffer for the given page (DNewBuffer). So, the udisk_write() call ultimately succeeds. When we go to flush the dirty data to disk when committing the transaction, after we have successfully written the transaction log, DFlush() fails for the -2097152 page when the pwrite() call eventually fails with EINVAL, causing ubik to panic, logging the message: Ubik PANIC: Writing Ubik DB modifications When the vlserver gets restarted by bosserver, we then process the transaction log, and perform the operations in the log before starting up (ReplayLog). The log records the actual data we wrote, not split into pages, and the log-replaying code writes directly to the db using uphys_write instead of udisk_write. So, because of this, the write actually succeeds when replaying the log, since we just write 148 bytes to offset 2147483624 (0x7FFFFFE8), and no negative offsets are used. The vlserver will then be able to run, but will be unable to read that newly-created vlentry, since it involves reading a ubik page beyond the 31-bit boundary. That means trying to look up that entry will fail with i/o errors, as will looking up any entry on the same hash chains as the new entry (since the new entry will be added to the head of the hash chain). Listing all entries in the database will also just show an empty database, since our vital_header.eofPtr will be negative, and we determine EOF by comparing our current blockindex to the value in eofPtr.
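The numbers in this example can be checked with plain integer arithmetic. What follows is a minimal sketch in C, not the vlserver code: sketch_grow_eof and sketch_page_phys_offset are hypothetical names, EFBIG is an arbitrary error value picked for the sketch, and the only facts taken from the message are the signed 32-bit offsets, the 148-byte vlentry, the 1k ubik page size, and the rule that a page starts at page*1024+64 in vldb.DB0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 1024   /* ubik pages are 1k (from the commit message) */
#define SKETCH_HDR_SIZE  64     /* physical offset is page*1024+64 */

/*
 * Hypothetical stand-in for the kind of check grow_eofPtr() is described as
 * doing: refuse to grow a signed 32-bit EOF offset past the 31-bit limit.
 * The comparison is arranged so that no signed overflow can occur.
 * Returns 0 on success, or EFBIG (an arbitrary choice for this sketch).
 */
int
sketch_grow_eof(int32_t *eof, int32_t bump)
{
    assert(eof != NULL);
    if (bump < 0 || *eof < 0 || *eof > INT32_MAX - bump)
        return EFBIG;
    *eof += bump;
    return 0;
}

/* Physical file offset where a page starts; 64-bit so the sketch cannot wrap. */
int64_t
sketch_page_phys_offset(int32_t page)
{
    return (int64_t)page * SKETCH_PAGE_SIZE + SKETCH_HDR_SIZE;
}
```

With the EOF offset at 2147483560 and a 148-byte vlentry, the check refuses to grow instead of wrapping, and the two page numbers from the message map to physical offsets 2147482688 and -2147483584, the second of which is the negative offset that pread() and pwrite() reject.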
Change-Id: Ie0b7ac61f9121fa265686449efbae8e18edb1896 Reviewed-on: https://gerrit.openafs.org/14180 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk Reviewed-by: Cheyenne Wills commit d73680c5f70ee5aeb634a9ec88bf1097743d0f76 Author: Cheyenne Wills Date: Mon May 11 14:06:19 2020 -0600 vol: Fix format-truncation warning with gcc-10.1 Building with gcc-10.1 produces a warning (error if --enable-checking) in vol-salvage.c error: ‘%s’ directive output may be truncated writing up to 755 bytes into a region of size 255 [-Werror=format-truncation=] 809 | snprintf(inodeListPath, 255, "%s" OS_DIRSEP "salvage.inodes.%s.%d", tdir, name, Use strdup/asprintf to allocate the buffer dynamically instead of using a buffer with a hardcoded size. Change-Id: Ib2f01c2eb73c7abc162be2b1939e55688a81f812 Reviewed-on: https://gerrit.openafs.org/14207 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit c81579dc7b0c0ac6bc34f63384d705a4445c2bbd Author: Andrew Deason Date: Mon May 18 12:09:38 2020 -0500 auth: Close fd on SetExtendedCellInfo write error Currently, and since OpenAFS 1.0, if write() fails here, we leak the file descriptor. A write() failure should be very unlikely, but close the fd to make sure we avoid the leak. Change-Id: I4e8ed4216c4aa5041232fc798a7bc59f6a5570d9 Reviewed-on: https://gerrit.openafs.org/14213 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 85df3e3d43e033b1c25c33e4a74d4b7b59b567b5 Author: Andrew Deason Date: Sun Jul 21 18:55:49 2019 -0500 afs: Free rx/rxevent resources during shutdown Call shutdown_rx() and shutdown_rxevent() near the end of our shutdown sequence, in order to free various Rx resources and avoid memory leaks. 
Change-Id: Id2e912295cf760b5ad83057487e6c4c4fadda11b Reviewed-on: https://gerrit.openafs.org/13719 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 17b42fe67c18fab0003fb712092d36f06c93f2eb Author: Cheyenne Wills Date: Thu Apr 30 10:31:17 2020 -0600 LINUX-5.7: replace __pagevec_lru_add with lru_cache_add_file The Linux function __pagevec_lru_add is no longer exported in Linux 5.7-rc1 commit bde07cfc65da5fe6c63fe23f035f5ccc0ffd89e0 "mm/swap.c: not necessary to export __pagevec_lru_add()". As a replacement, the Linux function lru_cache_add_file can be used for adding a page to the lru cache. The internal processing of lru_cache_add_file manages its own internal pagevec and performs the following: get_page(...) if(!pagevec_add(...)) __pagevec_lru_add_file(...) Introduce an autoconf test for lru_cache_add_file and replace the calls associated with __pagevec_lru_add with lru_cache_add_file. NOTE: see Linux commit a0b8cab3b9b2efadabdcff264c450ca515e2619c "mm: remove lru parameter from __pagevec_lru_add and remove parts of pagevec API" as a reference for this change. The lru_cache_add_file was introduced in Linux 2.6.28, therefore this change affects systems with Linux 2.6.28 kernels and later. Change-Id: I12b32fd5061fc136f8b96ef3605e0bab736ca9ed Reviewed-on: https://gerrit.openafs.org/14159 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit dca95bcb7efdff38564dcff3e8f4189735f13b3a Author: Cheyenne Wills Date: Wed Apr 29 16:26:02 2020 -0600 libafs: Abstract the Linux lru cache interface Define static functions afs_lru_cache_init, afs_lru_cache_add and afs_lru_cache_finalize to handle interfacing with Linux's lru facilities. This change's primary purpose is to isolate the preprocessor conditionals associated with the details of the system lru interfaces to just these functions and to simplify the areas that utilize lru caching by removing the preprocessor conditionals. 
As Linux's lru facilities change, additional conditional code will be needed. Change-Id: I74c94bb712359975e3fd1df85f1b338b215f61b0 Reviewed-on: https://gerrit.openafs.org/14167 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 44b7b93b593371bfdddd0be0ae603f4f8720f78b Author: Andrew Deason Date: Sat May 2 23:54:55 2020 -0500 afs: Drop GLOCK for RXAFS_GetCapabilities We are hitting the net here; we certainly should not be holding AFS_GLOCK while waiting for the server's response. Found via FreeBSD WITNESS. Change-Id: Ie727db27adaeed23ac8cff7665143bae2ce2ede8 Reviewed-on: https://gerrit.openafs.org/14181 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 5d53ed0bdab6fea6d2426691bdef2b6f9cb7f2fe Author: Yadavendra Yadav Date: Wed Apr 29 05:10:05 2020 +0000 rxkad: Use krb5_enctype_keysize in tkt_DecodeTicket5 Inside the tkt_DecodeTicket5 function (rxkad/ticket5.c), keysize is calculated by calling krb5_enctype_keybits and dividing the number of bits by 8. For 3DES the number of key bits is 168, so keysize comes out to 21 (168/8). However, the actual keysize of a 3DES key is 24. This keysize is passed to _afsconf_GetRxkadKrb5Key, where a keysize comparison happens; since the keysizes do not match, it returns AFSCONF_BADKEY. To fix this issue, get keysize from the krb5_enctype_keysize function instead of krb5_enctype_keybits. Thanks to John Janosik (jpjanosi@us.ibm.com) for analyzing and fixing this issue. Change-Id: Ia6f70b878feaa91855f9544ec1de81a6196a85a8 Reviewed-on: https://gerrit.openafs.org/14203 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 9866511bb0a5323853e97e3ee92524198813776e Author: Andrew Deason Date: Sun Jul 21 18:48:51 2019 -0500 rx: Avoid osi_NetSend during rx shutdown Commit 8d939c08 (rx: avoid nat ping during shutdown) added a call to shutdown_rx() inside the DARWIN shutdown sequence, before the rx socket was closed.
From the commit message, it sounds like this was done to avoid NAT pings from calling osi_NetSend during the shutdown sequence after the rx socket was closed; calling shutdown_rx() before closing the socket would cause any connections we had to be destroyed first, avoiding that. The problem with this is that this means shutdown_rx() is called when osi_StopNetIfPoller is called, which is much earlier than some other portions of the shutdown sequence; some of which may hold references to e.g. rx connections. If we try to, for instance, destroy an rx connection after shutdown_rx() is called, we could panic. An earlier version of that commit (gerrit PS1) just tried to insert a check before the relevant osi_NetSend call, making us just skip the osi_NetSend if the shutdown sequence had been started. So to avoid the above issue, try to implement that approach instead. And instead of doing it just for NAT pings, we can do it for almost all osi_NetSend calls (besides those involved in the shutdown sequence itself), by checking this in rxi_NetSend. Also return an error (ESHUTDOWN) if we skip the osi_NetSend call, so we're not completely silent about doing so. This means we also remove the call to shutdown_rx() inside DARWIN's osi_StopNetIfPoller(). This allows us to interact with Rx objects during more of the shutdown process in cross-platform code. Change-Id: I4e631b28d090635aeacd59de0fd237d572f97e93 Reviewed-on: https://gerrit.openafs.org/13718 Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 929d501421579290ce1d4f9aabe45980e5458a9a Author: Cheyenne Wills Date: Fri Apr 3 15:00:42 2020 -0600 Add more 'fall through' switch comments Commit a455452d (LINUX 5.3: Add comments for fallthrough switch cases) added the special /* fall through */ comment to various switch/case blocks, in order to avoid implicit-fallthrough warnings from causing the build to fail when building the Linux kernel module. 
In this commit, add additional /* fall through */ comments to the rest of the tree where falling through is intentional. Add a "break;" in one place in dumptool.c where falling through seems like a mistake, and flag certain functions as AFS_NORETURN to avoid needing to explicitly break or fallthrough. Check for the availability of the -Wimplicit-fallthrough compiler flag and use it when --enable-checking is set, to prevent additional cases from creeping into the tree. Note: the -Wimplicit-fallthrough compiler flag was added in gcc 7. Change-Id: Iae34e7969606603da8358d7cfa5fd04279b218dc Reviewed-on: https://gerrit.openafs.org/14125 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 4512d04a9b721cd9052c0e8fe026c93faf6edb9e Author: Kailas Zadbuke Date: Thu May 7 23:55:39 2020 -0400 salvaged: Fix "-parallel all" parsing In salvageserver, the -parallel option takes an "all" argument. However, the code does not parse the numeric part correctly. Due to this, only a single instance of the salvageserver process was running even if we provided a larger number with the "all" argument. With this fix, the numeric part of the "all" argument is parsed correctly and the required number of salvageserver instances is started. Change-Id: Ib6318b1d57d04fecb84915e2dabe40930ea76499 Reviewed-on: https://gerrit.openafs.org/14201 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 790824ff749b6ee01c4d7101493cbe8773ef41c6 Author: Cheyenne Wills Date: Sun Apr 5 15:51:17 2020 -0600 cf: Use common macro to test compiler flags Use the AX_APPEND_COMPILE_FLAGS macro to test and set compiler specific flags. Remove the OPENAFS_GCC_SUPPORTS_MARCH check entirely (and the associated P5PLUS_KOPTS), since nothing has used it for quite some time.
Change-Id: Ic9626c52ac62cf83d4b8c787aa5aa966e558a781 Reviewed-on: https://gerrit.openafs.org/14132 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 98b5ffb52117aefac5afb47b30ce9b87eb2fdebf Author: Andrew Deason Date: Mon Apr 20 13:03:15 2020 -0500 ubik: Avoid unlinking garbage during recovery In urecovery_Interact, if any of our operations fail around calling DISK_GetFile, we will jump to FetchEndCall and eventually unlink 'pbuffer'. But if we failed before opening our .DB0.TMP file, the contents of 'pbuffer' will not be initialized yet. During most iterations of the recovery loop, the contents of 'pbuffer' will be filled in from previous loops, and it should always stay the same, so it's not a big problem. But if this is the first iteration of the loop, the contents of 'pbuffer' may be stack garbage. Solve this in two ways. To make sure we don't use garbage contents in 'pbuffer', memset the whole thing to zeroes at the beginning of urecovery_Interact(). And then to make sure we're not reusing 'pbuffer' contents from previous iterations of the loop, also clear the first character to NUL each time we arrive at this area of the recovery code. And avoid unlinking anything if pbuffer starts with a NUL. Commit 44e80643 (ubik: Avoid unlinking garbage) fixes the same issue, but only fixed it in the SDISK_SendFile codepath in remote.c. Change-Id: Ica39e66efa89562068a4be3a14b2d13594b77f6d Reviewed-on: https://gerrit.openafs.org/14153 Tested-by: BuildBot Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit ca847ddf35e336a8bc3159ce4b26f0162417bbd5 Author: Andrew Deason Date: Sat Apr 4 22:35:07 2020 -0500 Use autoconf-archive m4 from src/external Switch to using the m4 macros from autoconf-archive in our src/external mechanism, instead of manually-copied versions in src/cf. The src/external copy of ax_gcc_func_attribute.m4 is identical to the existing copy in src/cf, so that should incur no changes. 
There are also a few new macros pulled in, but they are currently unused. Increase our AC_PREREQ in configure.ac to 2.64, to match the AC_PREREQ in some of the new files. Change-Id: I8acfe4df7b9a22d9b9e69004c3438034a2dacadb Reviewed-on: https://gerrit.openafs.org/14135 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit d8205bbb482554812fbe66afa3c337d991a247b6 Author: Autoconf Archive Maintainers Date: Tue Apr 7 10:23:16 2020 -0500 Import of code from autoconf-archive This commit updates the code imported from autoconf-archive to 24358c8c5ca679949ef522964d94e4d1cd1f941a (v2019.01.06) New files are: m4/ax_append_compile_flags.m4 m4/ax_append_flag.m4 m4/ax_check_compile_flag.m4 m4/ax_gcc_func_attribute.m4 m4/ax_require_defined.m4 Change-Id: I64e14d1b4d41ebfee82fa92da10239f73e28b4c9 Reviewed-on: https://gerrit.openafs.org/14138 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit a072c65bba86cbcd81157e354d3719ac41a2c97d Author: Andrew Deason Date: Sat Apr 4 22:28:21 2020 -0500 Add autoconf-archive to src/external Add autoconf-archive to the src/external mechanism, so we can more easily import and update the AX_* m4 macros we pull in from autoconf-archive. Commits are imported from . We already have a copy of ax_gcc_func_attribute.m4 in the tree, so include that in the list of files. While we're here, also include a few more macros for checking compiler flags, which will be used in subsequent commits. Change-Id: I8c6288fc1d48a47837ca08f8b9207e0ada921af8 Reviewed-on: https://gerrit.openafs.org/14133 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit c05d8b28d3213856d54896979382daa066b64673 Author: Michael Meffie Date: Fri Jul 5 09:28:50 2019 -0400 Update NEWS for OpenAFS 1.9.0 Add change descriptions for commits not in a stable release. 
Change-Id: Ib1d5ce9f558279660abb2473ce8a9fac4fcefa8d Reviewed-on: https://gerrit.openafs.org/13673 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 1547db22264f21b5d553f54498aee51879539786 Author: Benjamin Kaduk Date: Fri Mar 20 09:17:13 2020 -0700 Synchronize NEWS with 1.8.5 Pull in all the updates to NEWS that occurred on the 1.8.x branch in preparation for adding entries for 1.9.0. Change-Id: I713d1576ef96793f24824f909b26da802b21ec23 Reviewed-on: https://gerrit.openafs.org/14103 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit befc72749884c6752c7789479343ba48c7d5cea1 Author: Andrew Deason Date: Sun Apr 26 17:26:02 2020 -0500 rx: Use _IsLast to check for last call in queue Ever since commits 170dbb3c (rx: Use opr queues) and d9fc4890 (rx: Fix test for end of call queue for LWP), rx_GetCall checks if the current call is the last one on rx_incomingCallQueue by doing this: opr_queue_IsEnd(&rx_incomingCallQueue, cursor) But opr_queue_IsEnd checks if the given pointer is the _end_ of the list; that is, if it's the end-of-list sentinel, not an item on the actual list. Testing for the last item in a list is what opr_queue_IsLast is for. This is the same convention that the old Rx queues used, but 170dbb3c just accidentally replaced queue_IsLast with opr_queue_IsEnd (instead of opr_queue_IsLast), and d9fc4890 copied the mistake. Because this is inside an opr_queue_Scan loop, opr_queue_IsEnd will never be true, so we'll never enter this block of code (unless we are the "fcfs" thread). This means that an incoming Rx call can get stuck in the incoming call queue, if all of the following are true: - The incoming call consists of more than 1 packet of incoming data. - The incoming call "waits" when it comes in (that is, there are no free threads or the service is over quota). - The "fcfs" thread doesn't scan the incoming call queue (because it is idle when the call comes in, but the relevant service is over quota).
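The difference between an "is end" test and an "is last" test can be sketched with a small, self-contained list. Everything below (`struct node`, `list_is_end`, `list_is_last`, `count_hits`) is hypothetical and only modelled on the sentinel convention described above; it is not the real opr_queue implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular doubly-linked list with a sentinel head. */
struct node {
    struct node *next;
    struct node *prev;
};

static void list_init(struct node *head)
{
    head->next = head->prev = head;
}

static void list_append(struct node *head, struct node *item)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* True only for the sentinel itself, i.e. "one past" the last item. */
static int list_is_end(struct node *head, struct node *cursor)
{
    return cursor == head;
}

/* True for the final real item on the list. */
static int list_is_last(struct node *head, struct node *cursor)
{
    return cursor->next == head;
}

/*
 * Walk every real item (as a scan loop would) and count how often each
 * test fires. The loop stops before reaching the sentinel, so the "end"
 * test can never be true inside the loop body.
 */
void count_hits(int n_items, int *end_hits, int *last_hits)
{
    struct node head, items[8];
    struct node *cursor;
    int i;

    assert(n_items >= 0 && n_items <= 8);

    list_init(&head);
    for (i = 0; i < n_items; i++)
        list_append(&head, &items[i]);

    *end_hits = 0;
    *last_hits = 0;
    for (cursor = head.next; cursor != &head; cursor = cursor->next) {
        if (list_is_end(&head, cursor))
            (*end_hits)++;
        if (list_is_last(&head, cursor))
            (*last_hits)++;
    }
}
```

In this sketch, a scan over any non-empty list sees the "last" test fire exactly once and the "end" test never, which matches the behavior the commit message describes.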
To fix this, just use opr_queue_IsLast here instead of opr_queue_IsEnd. Change-Id: I04b90b1279f81dc518eb61e7bd450e3c0be37a77 Reviewed-on: https://gerrit.openafs.org/14158 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit ebaefc5a06fb3b559ce3649676197d0a989efbde Author: Andrew Deason Date: Sat Apr 25 18:21:10 2020 -0500 tests: Give more leeway in rx/event-t Currently, the rx/event-t tests schedule a bunch of events up to 3 seconds in the future, and then we sleep for 3 seconds to give them a chance to run. Since we're cutting it so close, this can rarely result in a few events not being run (observed occasionally on FreeBSD 12.1, where we failed to run about 3 events out of 10000). To avoid this, just sleep for 4 seconds instead of 3. Also print out a little more info regarding the number of fired/cancelled events, so we can see the event count when it's wrong. Change-Id: I6269bea2c245aeed00c129ff638423d0fa81ad23 Reviewed-on: https://gerrit.openafs.org/14160 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 2b4908d3be8c4bde135d836ccc4ca96e465628c3 Author: Mark Vitale Date: Thu Apr 23 17:49:20 2020 -0400 afs: fix afs_linux_mmap fstrace entry The format string for CM_TRACE_GMAP takes 4 substitutions, but afs_linux_mmap only supplies 3. This results in malformed output from fstrace: Type mismatch, using raw print. Gn_map vp 0x%lx addr 0x%lx len 0x%x off 0x%x (afs / zcm)raw op 701087775, time 715.322573, pid 9644 p0:0xc0a66ec0 p1:0x8b81a000 p2:131072 Repair the recording of CM_TRACE_GMAP. Change-Id: I2b7592e68cb42f5ae490ee8771558e5cc5a2181e Reviewed-on: https://gerrit.openafs.org/14168 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit df5480057c2994914e22bd14b169dbcd8857485a Author: Andrew Deason Date: Sun Apr 12 22:28:29 2020 -0500 tests: Skip SIGBUS test on FreeBSD Currently, 'softsig-helper -buserror' causes a SIGBUS on most platforms, but can result in SIGSEGV on FreeBSD by default (at least on 11.3-RELEASE). 
Skip the test on FreeBSD, until we can provide a more reliable way to generate SIGBUS. Note that when the sysctl machdep.prot_fault_translation is set to 1, 'softsig-helper -buserror' generates a SIGBUS instead of SIGSEGV, suggesting that generating a SIGBUS here is the old 'compat' behavior. When machdep.prot_fault_translation is 0 (the default), the code path in the FreeBSD kernel that dictates whether to send a SIGBUS or SIGSEGV in this situation depends on some autodetection heuristics, and so may produce different results depending on FreeBSD releases or even compiler settings (due to detection of ABI based on some ELF notes in the relevant binary). For some details on this sysctl, see or the FreeBSD source code. In 11.3-RELEASE, the decision to issue a SIGBUS or SIGSEGV can be found around sys/amd64/amd64/trap.c:355. Change-Id: Ib75b43cc12302532ee87a3744fc364424f2a3ca6 Reviewed-on: https://gerrit.openafs.org/14145 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 61993cf45a648906abb865756d5a98d9c2d7cc40 Author: Andrew Deason Date: Tue Nov 26 23:39:24 2019 -0600 FBSD: Avoid holding AFS_GLOCK during vinvalbuf Currently we call vinvalbuf(9) in a few places while holding AFS_GLOCK, but AFS_GLOCK is a non-sleepable lock (struct mtx), and vinvalbuf can sleep. This can trigger a panic in some rare conditions, with the message: Sleeping thread (tid 100179, pid 95481) owns a non-sleepable lock To avoid this, drop AFS_GLOCK around a few places that call vinvalbuf(). 
Change-Id: I58acb144b6ffa007675402e7639b63ff3745dec5 Reviewed-on: https://gerrit.openafs.org/13970 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit e510e35b25f605090524598b6b48cd20d3102945 Author: Andrew Deason Date: Sun Sep 15 23:00:26 2019 -0500 afs: Fix ifdef indenting in afs_vcache.c Change-Id: Ib566156184cb3f64a0983babd5d9f7883c84cc85 Reviewed-on: https://gerrit.openafs.org/13877 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 7260c7164b9a2199c7b5f83279fa18af16e7d387 Author: Andrew Deason Date: Sun Sep 8 16:10:40 2019 -0500 FBSD: Remove MA_* abstractions In FBSD/osi_vnops.c, we have a few abstractions (e.g. MA_VOP_UNLOCK) that used to expand to different things for older FreeBSD versions. Currently, they always expand to the same thing, so just remove the abstractions. While we are changing these calls, also change one instance of MA_VOP_LOCK to vn_lock (instead of VOP_LOCK), since we're not usually supposed to call VOP_LOCK directly, according to the VOP_LOCK(9) manpage. The MA_VOP_LOCK call was added in commit bd707fb7 (freebsd-almost-working-client-20020216), seemingly by mistake. Change-Id: Ia0f28fe658057e87d9103a72296ab899dc762fb6 Reviewed-on: https://gerrit.openafs.org/13843 Reviewed-by: Tim Creech Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 0ee53d2fe9341e60f420662749d5ae8c6d4b5f24 Author: Tim Creech Date: Fri Dec 13 22:24:57 2019 -0500 FBSD: Build vnode_if.h before libafs objs Currently, if we are building with -j2 or higher, we can easily fail to build some libafs objects because vnode_if.h does not exist yet. vnode_if.h is generated by the FreeBSD build, but none of our objects depend on it, so during parallel builds it may not be available by the time we build, for example, src/external/heimdal/hcrypto/sha256.c. This results in build errors that can look like this: --- sha256-kernel.o --- cc -I. -I.. 
-I../nfs [...]/src/external/heimdal/hcrypto/sha256.c In file included from [...]/src/external/heimdal/hcrypto/sha256.c:34: In file included from [...]/src/crypto/hcrypto/kernel/config.h:30: In file included from [...]/src/afs/sysincludes.h:354: /usr/src/sys/sys/vnode.h:588:10: fatal error: 'vnode_if.h' file not found #include "vnode_if.h" ^~~~~~~~~~~~ 1 error generated. *** [sha256-kernel.o] Error code 1 make[4]: stopped in [...]/src/libafs/MODLOAD 1 error To avoid this, make all of our libafs objects depend on vnode_if.h. [adeason@dson.org: Expanded commit message.] Change-Id: I5a7a6ece8d5fbe6cf1a5b94451c8e8ae93fdc55f Reviewed-on: https://gerrit.openafs.org/13983 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 1bd03c9c22ca7f36b9f1647c258b5f18c8ac92c0 Author: Andrew Deason Date: Sun Apr 12 20:16:55 2020 -0500 tests: Run perl via 'env' The 'perl' binary may not be /usr/bin/perl, depending on the system. For example, on modern FreeBSD it tends to be /usr/local/bin/perl instead. To avoid relying on perl to be in a specific location, just run via /usr/bin/env instead, so we pick up perl from $PATH instead. Change-Id: Ic8dc247c82342ff79dfa80426c489ccb8e3e1450 Reviewed-on: https://gerrit.openafs.org/14144 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 17a845c8d44f453b09b21afd59182e616234e872 Author: Tim Creech Date: Sun Mar 5 18:15:58 2017 -0500 FBSD: Remove LOCKPARENT/ISLASTCN lookup logic Currently, our afs_vop_lookup on FBSD tries to only lock 'dvp' for ISDOTDOT requests when LOCKPARENT and ISLASTCN are set. There are a couple of problems with this: - The conditional locking logic involving LOCKPARENT/ISLASTCN is only relevant in very old FreeBSD releases (per-fs checking of these flags for parent locking went away around the FreeBSD 6 era). - Our current logic here is wrong anyway, since we try to lock 'dvp' twice when those flags are set.
This was mostly introduced by commit 2f6be821 (FBSD: band-aid vnode locking in lookup), which added a lock/unlock pair for 'dvp' around the lock for 'vp', even though 'dvp' was unlocked several lines earlier. This means that if we hit the relevant code path, we will deadlock, since we try to lock 'dvp' twice. To avoid this, just remove the relevant logic for LOCKPARENT/ISLASTCN, since it is only relevant for old FreeBSD releases that are not supported by us or FreeBSD. Add and rearrange some comments around here to try to more explicitly explain the relevant locking rules. [adeason@dson.org: Commit message rewrite, adding comments, removing old FreeBSD code.] Change-Id: Iaa2c55d82c50d5a8ab42c67b0996a2b4fb6e09e6 Reviewed-on: https://gerrit.openafs.org/12578 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 7df5c003ed6eb17a693d67ffdfc0556f0c569cc1 Author: Andrew Deason Date: Sun Apr 12 22:40:14 2020 -0500 FBSD: Remove unused 'wantparent' logic In afs_vop_lookup, the 'wantparent' variable doesn't actually change any logic in the function. In the if() clause where it's used, the value of 'wantparent' is only ever used if cnp->cn_nameiop is RENAME and ISLASTCN is set. But if both of those are true, then the second half of the if() conditional will always be true, so the value of 'wantparent' doesn't matter. So to remove this confusing unused logic, remove the 'wantparent' local var, and all its associated logic. Issue spotted by kaduk@mit.edu.
Change-Id: Ia63b88d67d21cc2b81a0c25aa31ea60ab202b0a7 Reviewed-on: https://gerrit.openafs.org/14143 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 7116de596a8f1d0be3da6eebe92d486f57aefd02 Author: Andrew Deason Date: Sun Aug 18 19:59:50 2019 -0500 FBSD: Add support for FreeBSD 11.3 Change-Id: Ibe3496f06da83a0b30182ea92081bae41fe766f3 Reviewed-on: https://gerrit.openafs.org/13792 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 8002a46125e8224ba697c194edba5ad09e4cfc44 Author: Yadavendra Yadav Date: Wed Apr 15 05:33:00 2020 -0500 LINUX: Always crref after _settok_setParentPag Commit b61eac78 (Linux: setpag() may replace credentials) changed PSetTokens2 to call crref() after _settok_setParentPag(), since changing the parent PAG may change our credentials structure. But that commit did not update the old pioctl PSetTokens, so -setpag functionality remained broken on Linux for utilities that called the old pioctl ('klog' is one such utility). To fix this, we could copy the same code from PSetTokens2 into PSetTokens. But instead just move this code into _settok_setParentPag itself, to avoid code duplication. This commit also refactors _settok_setParentPag a little to make the platform-specific ifdefs a little easier to read through. Change-Id: I65a165ebb1d823e690926de31b28a7728d2561b9 Reviewed-on: https://gerrit.openafs.org/14147 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Yadavendra Yadav Reviewed-by: Benjamin Kaduk commit 826bb826274e48c867b41cb948d031a423373901 Author: Yadavendra Yadav Date: Wed Apr 15 05:33:00 2020 -0500 LINUX: Copy session keys to parent in SetToken Commit 48589b5d (Linux: Restore aklog -setpag functionality for kernel 2.6.32+) added code to SetToken() to copy our session keyring to the parent process, in order to implement -setpag functionality. But this was removed from SetToken() in commit 1a6d4c16 (Linux: fix aklog -setpag to work with ktc_SetTokenEx), when the same code was moved to ktc_SetTokenEx(). 
Add this code back to SetToken(), so -setpag functionality can work again with utilities that use older functions like ktc_SetToken, like 'klog'. Change-Id: I68c9bf2e19783ea6f84b4c5ebf2ef188d1d8d6ad Reviewed-on: https://gerrit.openafs.org/14146 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit be50d9a517bda9f421414341bca34c0100d61ba0 Author: Michael Meffie Date: Fri Mar 20 18:17:56 2020 -0400 redhat: add make to the build requirements `make` is not necessarily installed, even when all the other build requirements are installed. Add `make` to the list of build requirements to complete them. With this change it is possible to build the packages after running `yum-builddep` to install all of the needed build requirements. Change-Id: I032ba1f23d08468c5e21edc5662b20cc9498d1c9 Reviewed-on: https://gerrit.openafs.org/14119 Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 7e41ee0bd50d39a356f0435ff370a0a7be40306f Author: Andrew Deason Date: Tue Apr 7 13:15:31 2020 -0500 vlserver: Correctly pad nvlentry for "O" RPCs For our old-style "O" RPCs (e.g. VL_CreateEntry, instead of VL_CreateEntryN), vlserver calls vldbentry_to_vlentry to convert to the internal 'struct nvlentry' format. After all of the sites have been copied to the internal format, we fill the remaining sites by setting the serverNumber to BADSERVERID. For nvldbentry_to_vlentry, we do this for NMAXNSERVERS sites, but for vldbentry_to_vlentry, we do this for OMAXNSERVERS. The thing is, both functions are filling in entries for a 'struct nvlentry', which has NMAXNSERVERS 'serverNumber' entries. So for vldbentry_to_vlentry, we are skipping setting the last few sites (specifically, NMAXNSERVERS-OMAXNSERVERS = 13-8 = 5). This can easily cause our O-style RPCs to write out entries to disk that have uninitialized sites at the end of the array.
For example, an entry with one site should have server numbers that look like this: serverNumber = {1, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255} That is, one real serverid (a '1' here), followed by twelve BADSERVERIDs. But for a VL_CreateEntry call, the 'struct nvlentry' is zeroed out before vldbentry_to_vlentry is called, and so the server numbers in the written entry look like this: serverNumber = {1, 255, 255, 255, 255, 255, 255, 255, 0, 0, 0, 0, 0} That is, one real serverid (a '1' here), followed by seven BADSERVERIDs, followed by five '0's. Most of the time, this is not noticeable, since our code that reads in entries from disk stops processing sites when we encounter the first BADSERVERID site (see vlentry_to_nvldbentry). However, if the entry has 8 sites, then none of the entries will contain BADSERVERID, and so we will actually process the trailing 5 bogus sites. This would appear as 5 extra volume sites for a volume, most likely all for the same server. For VL_CreateEntry, the vlentry struct is always zeroed before we use it, so the trailing sites will always be filled with 0. For VL_ReplaceEntry, the trailing sites will be unchanged from whatever was read in from the existing disk entry. To fix this, just change the relevant loop to go through NMAXNSERVERS entries, so we actually go to the end of the serverNumber (et al) array. This may appear similar to commit ddf7d2a7 (vlserver: initialize nvlentry elements after read). However, that commit fixed a case involving the old vldb database format (which hopefully is not being used). This commit fixes a case where we are using the new vldb database format, but with the old RPCs, which may still be used by old tools. 
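The padding problem described above can be reproduced in a small sketch. The constants take the values given in this message (13, 8 and 255); `struct sketch_entry`, `fill_sites` and `count_sites` are hypothetical, simplified stand-ins and are not the vlserver code.

```c
#include <assert.h>
#include <string.h>

#define OMAXNSERVERS 8     /* site slots in the old-style entry */
#define NMAXNSERVERS 13    /* site slots in the internal entry */
#define BADSERVERID  255   /* marks an unused site slot */

/* Hypothetical, simplified stand-in for the internal entry format. */
struct sketch_entry {
    unsigned char serverNumber[NMAXNSERVERS];
};

/*
 * Copy 'nsites' server ids into the entry, then pad the slots up to
 * 'limit' with BADSERVERID. Passing OMAXNSERVERS as 'limit' mimics the
 * old behavior; passing NMAXNSERVERS mimics the fixed behavior.
 */
void fill_sites(struct sketch_entry *e, const unsigned char *ids,
                int nsites, int limit)
{
    int i;

    assert(nsites >= 0 && nsites <= OMAXNSERVERS);
    assert(limit >= nsites && limit <= NMAXNSERVERS);

    memset(e, 0, sizeof(*e));   /* entry starts zeroed, as for a create */
    for (i = 0; i < nsites; i++)
        e->serverNumber[i] = ids[i];
    for (; i < limit; i++)
        e->serverNumber[i] = BADSERVERID;
}

/* Count sites the way a reader would: stop at the first BADSERVERID. */
int count_sites(const struct sketch_entry *e)
{
    int i;

    for (i = 0; i < NMAXNSERVERS; i++) {
        if (e->serverNumber[i] == BADSERVERID)
            break;
    }
    return i;
}
```

With a single site, both limits give the same answer because the reader stops at the first BADSERVERID. With eight sites and the shorter limit, no BADSERVERID is ever written, so the reader runs through all thirteen slots and picks up the five zero-filled ones.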
Change-Id: Ic6882d1452963ca93403748917c313068acfdaab Reviewed-on: https://gerrit.openafs.org/14139 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 30a47c3282cb405459a6fced1fe5b4c77f4afd64 Author: Michael Meffie Date: Fri Mar 20 17:53:22 2020 -0400 redhat: fix rpmbuild warnings Fix warnings issued by recent versions of rpmbuild: warning: Macro expanded in comment on line 110: %{afsvers}/... warning: extra tokens at the end of %endif directive in line 1469: %endif # build_userspace warning: line 331: It's not recommended to have unversioned Obsoletes: Obsoletes: openafs-client-compat The first two warnings are just issues with comments, which apparently are not completely ignored by rpmbuild. The third issue is a warning about an unversioned "Obsoletes" directive. Remove the old Obsoletes for openafs-client-compat, which was obsoleted no later than the 1.4.x series (more than 10 years ago). While here, clean up the spec by removing the old cvs $Revision$ keyword from the comments at the top of the file, and removing an old commented out setup directive. Change-Id: I8d7a050ea6a0cc7a2d9a6af9a91d25ce545586e7 Reviewed-on: https://gerrit.openafs.org/14118 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 19524a49d4389bff6f7ba9d9c355489450579c01 Author: Andrew Deason Date: Mon Mar 30 14:21:21 2020 -0500 opr: Allow non-2^x for n_buckets in opr_cache_init Currently, opr_cache_init requires that opts->n_buckets is a power of 2 (since our underlying opr_dict requires this). However, callers may want to pick a number of buckets based on some other value. Requiring each caller to calculate the nearest power-of-2 is annoying, so instead just have opr_cache_init itself calculate a nearby power of 2. That is, with this commit, opts->n_buckets is allowed to not be a power of 2; when it's not a power of 2, opr_cache_init will calculate the next highest power of 2 and use that as the number of buckets.
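One way to round a bucket count up to a power of 2 is sketched below. `next_pow2` is a hypothetical helper written for illustration; it is not necessarily how opr_cache_init performs the calculation.

```c
#include <assert.h>

/*
 * Hypothetical helper: return the smallest power of 2 that is >= n.
 * Values that are already a power of 2 are returned unchanged.
 * The upper bound keeps the shift from overflowing a 32-bit unsigned.
 */
unsigned int next_pow2(unsigned int n)
{
    unsigned int p = 1;

    assert(n >= 1 && n <= (1u << 30));

    while (p < n)
        p <<= 1;
    return p;
}
```

A caller that wants roughly 1000 buckets would end up with 1024 under this scheme, while a request for 16 buckets stays at 16.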
Change-Id: Icd3c56c1fe0733e3dac964ea9a98ff7b436254e6 Reviewed-on: https://gerrit.openafs.org/14122 Tested-by: BuildBot Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit 3db8c37e8ef6bea0f03ef6b8f82ed93d52937d7d Author: Andrew Deason Date: Sun Apr 5 16:29:52 2020 -0500 libafs: Serialize INSTDIRS/DESTDIRS and COMPDIRS Our libafs build logic involves a few targets that 'cd' into a per-kernel subdir: notably INSTDIRS and DESTDIRS (the targets to 'make install' or 'make dest' our kernel modules) and COMPDIRS (the target to setup/build the kernel module). Both of these potentially 'cd' into a subdirectory (e.g. MODLOAD64), and run some make rules. Since INSTDIRS and COMPDIRS are different targets and don't depend on each other for many platforms, running those rules can happen in parallel. After they 'cd' into the relevant dir, they run a new 'make' in a subshell, and so underlying rules for building e.g. AFS_component_version_number.c are not serialized. So for a parallel build on, say, Solaris, we can encounter errors when two sub-makes try to make AFS_component_version_number.c at the same time, which looks something like this (with various lines output from other sub-processes mixed in): cd src && cd sys && gmake install gmake[3]: Leaving directory '/[...]/src/libuafs' rm -f AFS_component_version_number.c.NEW /opt/developerstudio12.6/bin/cc [...] -D_KERNEL -DSYSV -dn -m64 -xmodel=kernel -xvector=%none -xregs=no%float -Wu,-save_args -o AFS_component_version_number.o -c AFS_component_version_number.c mv: cannot access AFS_component_version_number.c.NEW gmake[4]: *** [/[...]/src/config/Makefile.version:13: AFS_component_version_number.c] Error 2 gmake[4]: Leaving directory '/[...]/src/libafs/MODLOAD64' gmake[3]: *** [Makefile:85: solaris_instdirs] Error 2 gmake[3]: *** Waiting for unfinished jobs.... To avoid this, just make INSTDIRS and DESTDIRS depend on COMPDIRS, so we can make sure they don't run at the same time. 
Change-Id: I2510e1894c44dd0864cf2eab5613b805342b6718 Reviewed-on: https://gerrit.openafs.org/14137 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 80edcab9997807f91798dacc2cc59efdba74be56 Author: Cheyenne Wills Date: Wed Apr 1 09:38:05 2020 -0600 butc: rename local var tapeblocks to numTapeblocks The local variable tapeblocks in GetConfigParams matches a global variable. Rename the local variable to avoid confusion with the global name. Change-Id: I1c30433696a35a74978ef0c23881c82054b416c5 Reviewed-on: https://gerrit.openafs.org/14128 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 8ae4531c5720baff9e11e4b05706eab6c82de5f9 Author: Michael Meffie Date: Mon Mar 23 09:46:05 2020 -0400 build: remove unused LINUX_PKGREL from configure.ac This change removes the unused LINUX_PKGREL definition from the configure.ac file. Commit 6a27e228bac196abada96f34ca9cd57f32e31f5c converted the setting of the RPM package version and release values in the openafs.spec file from autoconf to the makesrpm.pl script. That commit left LINUX_PKGREL in configure.ac because it was still referenced by the Debian packaging, which was still in-tree at that time. Commit ada9dba0756450993a8e57c05ddbcae7d1891582 removed the last trace of the Debian packaging, but missed the removal of the LINUX_PKGREL. Change-Id: I17aeccdb38078faa413f2cd3a935b43238982606 Reviewed-on: https://gerrit.openafs.org/14117 Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit f16d40ad26df3ec871f8c73952594ad2e723c9b4 Author: Andrew Deason Date: Wed Apr 1 22:59:38 2020 -0500 vos: Print "done" in non-verbose 'vos remsite' Currently, 'vos remsite' always prints the message "Deleting the replication site for volume %lu ...", and then calls VDONE if the operation is successful. 
VDONE prints the trailing "done", but only if -verbose is turned on, and so if -verbose is not specified, the output of 'vos remsite' looks broken: $ vos remsite fs1 vicepa vol.foo Deleting the replication site for volume 1234 ...Removed replication site fs1 /vicepa for volume vol.foo To fix this, unconditionally print the trailing "done", instead of going through VDONE, so 'vos remsite' output now looks like this: $ vos remsite fs1 vicepa vol.foo Deleting the replication site for volume 1234 ... done Removed replication site fs1 /vicepa for volume vol.foo Change-Id: I0b42f4cb9b695331bf047243bf6ae4a1cdbb89c4 Reviewed-on: https://gerrit.openafs.org/14127 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 0e2072ae386d4111bef161eb955964b649c31386 Author: Cheyenne Wills Date: Wed Apr 1 09:48:57 2020 -0600 Avoid duplicate definitions of globals GCC 10 changed a default flag from -fcommon to -fno-common. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85678 for some background. The change in gcc 10 results in build link-time errors. For example: ../../src/xstat/.libs/liboafs_xstat_cm.a(xstat_cm.o):(.bss+0x2050): multiple definition of `numCollections'; Ensure that only one definition for global data objects exist and change references to use "extern" as needed. To ensure that future changes do not introduce duplicated global definitions, add the -fno-common flag to XCFLAGS when using the configure --enable-checking setting. Change-Id: I6780dd995fe6fb6c2102765ff3484c18e1e1cd58 Reviewed-on: https://gerrit.openafs.org/14106 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit f841c189a53f3a6bcf5c25336e4e0ad5362036e2 Author: Andrew Deason Date: Tue Mar 31 21:19:18 2020 -0500 vos: Properly print volume transaction flags Currently, the code in 'vos status' treats the 'iflags' and 'vflags' of a transaction like an enumerated type; that is, we only check if 'iflags' is equal to ITOffline or ITBusy, etc. 
But both of these flags fields are bitfields; any combination of the relevant flags could theoretically be set. Practically speaking, we only ever set at most one of the flags in 'iflags', but if anything ever did set more than one flag, our output would look broken (we'd print "attachFlags:" without any flags). For 'vflags', multiple flags are often set at once: the most common combination is VTDeleteOnSalvage|VTOutOfService. So currently, we usually print "attachFlags:" without any actual flags, since the 'vflags' field isn't exactly equal to VTDeleteOnSalvage (instead it's set to VTDeleteOnSalvage|VTOutOfService). And if we ever did see just VTDeleteOnSalvage set by itself, the way the switch() cases fall through to each other, we'd print out that _all_ flags are set. To fix all of this, just test for the individual flag bits instead. Change-Id: Ib4d207bc713f0ef8eb51b9dbeaf2af50395536ee Reviewed-on: https://gerrit.openafs.org/14126 Tested-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 4c4fb6e36634e5663c8be25acd4a1ac872e4738c Author: Andrew Deason Date: Tue Jul 23 13:50:31 2019 -0500 LINUX: Introduce afs_d_path Move our preprocessor logic around d_path into an osi_compat.h wrapper, called afs_d_path. This just makes it a little easier to use d_path, and moves a tiny bit of #ifdef cruft away from real code. Change-Id: I2032eda3fef18be6e77e3bf362ec5ce641e1d76d Reviewed-on: https://gerrit.openafs.org/13721 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 252b3bcc75ea141ff93a7b3147865f4b952fcaca Author: Andrew Deason Date: Fri Aug 24 13:03:24 2018 -0500 afs: Detect VIOCPREFETCH special case properly Currently, afs_syscall_pioctl handles the VIOCPREFETCH pioctl as a special case, calling into a different code path to handle backgrounding the prefetch operation. However, we detect that we're handling a VIOCPREFETCH operation just by looking at the lower 8 bits of the given opcode. 
This means that any pioctl that ends in 0x0F will trigger this codepath, such as if we add a 'C' or 'O' pioctl that uses code 0x0F. We only want to catch VIOCPREFETCH requests for this code path, so fix the check to also check if we're processing a 'V' pioctl. Change-Id: Ica8c2364f96aa3c8b4d2213bebd9a1e4cb6fa730 Reviewed-on: https://gerrit.openafs.org/13301 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 66d0f91791695ac585f0511d0dadafd4e570b1bf Author: Andrew Deason Date: Tue Mar 24 11:59:48 2020 -0500 tests: Wait for server start in auth/superuser-t The auth/superuser-t test runs an Rx server and client in two child processes. If the client process tries to contact the server before the server has started listening on its port, some tests involving RPCs can fail (notably test 39, "Can run a simple RPC"). Normally if we try to contact a server that's not there, Rx will try resending its packets a few times, but on Linux with AFS_RXERRQ_ENV, if the port isn't open at all, we can get an ICMP_PORT_UNREACH error, which causes the relevant Rx call to die immediately with RX_CALL_DEAD. This means that if the auth/superuser-t client is only just a bit faster than the server starting up, tests can fail, since the server's port is not open yet. To avoid this, we can wait until the server's port is open before starting the client process. To do this, have the server process send a SIGUSR1 to the parent after rx_Init() is called, and have the parent process wait for the SIGUSR1 (waiting for a max of 5 seconds before failing). This should guarantee that the server's port will be open by the time the client starts running. Note that before commit 086d1858 (LINUX: Include linux/time.h for linux/errqueue.h), AFS_RXERRQ_ENV was mistakenly disabled on Linux 3.17+, so this issue was probably not possible on recent Linux before that commit. 
Change-Id: I0032a640b83c24f72c03e7bea100df5bc3d9ed4c Reviewed-on: https://gerrit.openafs.org/14109 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk Reviewed-by: Cheyenne Wills commit 18a0ea2f31e70e1bdbd7af40022ab107560ac0d0 Author: Andrew Deason Date: Tue Mar 24 11:34:51 2020 -0500 LINUX: Clear lock 'pid' fields with NULL Currently, when we release a lock, we set the e.g. pid_writer field to 0, to clear out any previous pid that was set. On Linux, the pid_writer field is a pointer, and sparse(1) complains about using a plain integer 0 in this way: CHECK [...]/afs_axscache.c [...]/afs_axscache.c:24:19: warning: Using plain integer as NULL pointer [...]/afs_axscache.c:68:9: warning: Using plain integer as NULL pointer [...]/afs_axscache.c:88:5: warning: Using plain integer as NULL pointer [...]/afs_axscache.c:111:13: warning: Using plain integer as NULL pointer [...]/afs_axscache.c:121:17: warning: Using plain integer as NULL pointer [...]/afs_axscache.c:126:17: warning: Using plain integer as NULL pointer [...]/afs_axscache.c:154:13: warning: Using plain integer as NULL pointer [...]/afs_axscache.c:165:9: warning: Using plain integer as NULL pointer This doesn't break anything, but it spews out quite a lot of warnings when building with sparse(1) available. To just reduce this noise a bit, assign these fields to actual NULL. Since some other platforms do use a plain integer in these fields (they are an actual pid), define 'MyPid_NULL' to use '0' or 'NULL' depending on the platform. Define MyPid_NULL to NULL only on Linux; this causes us to still assign 0 to a pointer on some platforms, but Linux is the only one that complains, so only bother using NULL on Linux for now. 
Change-Id: I35fcb896ceaa346c330622cfc2913b2975295836 Reviewed-on: https://gerrit.openafs.org/14108 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit abbadd3f4bf6fddc794b87d8d993ed6536c591e3 Author: Andrew Deason Date: Tue Mar 10 16:05:47 2020 -0500 rxgen: Properly generate brief union default arm Commit 13ae3de3 (Add "brief" option to rxgen) added the -b option to rxgen, which (among other things) makes rxgen stop including the name of an RPC-L union type within its fields. That is, instead of this: struct foo_type { afs_int32 foo_tag; union { /* ... */ } foo_type_u; }; rxgen -b generates this: struct foo_type { afs_int32 foo_tag; union { /* ... */ } u; }; And all of the autogenerated XDR code is altered to use the 'u' field instead of foo_type_u. However, if a 'default:' arm is defined in the definition for the RPC-L union, the autogenerated XDR code still tries to reference the non-brief name (e.g. foo_type_u). This causes a build failure when actually trying to compile the generated .xdr.c, like so: foo.xdr.c:809:39: error: 'foo_type' has no member named 'foo_type_u' if (!xdr_bytes(xdrs, (char **)&objp->foo_type_u.xxx, &__len, FOO_MAX)) { ^ foo.xdr.c:812:11: error: 'foo_type' has no member named 'foo_type_u' *(&objp->foo_type_u.xxx) = __len; This happens because the portion of emit_union() that generates the XDR code for the default arm wasn't updated to use a different formatting string when 'brief_flag' is set, like the rest of emit_union. To fix this, just check for brief_flag and use 'briefformat' accordingly, like the other code that checks for brief_flag. Currently nothing in the tree uses the default arm of RPC-L unions with 'rxgen -b', but external callers could, or our future code may do so. 
Change-Id: Ifcebfc48a3a64c68fee12ba0d177ae19b0956c58 Reviewed-on: https://gerrit.openafs.org/14107 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 8c335182115a1e16c66cde40c08ce9fd0144dccb Author: Marcio Barbosa Date: Thu Feb 27 22:28:14 2020 +0000 ubik: death to SVOTE_GetSyncSite The SVOTE_GetSyncSite RPC was intended to provide the IP address of the current sync-site. Unfortunately, the RPC-L incorrectly defined ahost as an input argument instead of an output argument. As a result, the IP address in question is not returned to the callers of SVOTE_GetSyncSite. Moreover, calls to this RPC must be made through connections associated with the VOTE_SERVICE_ID. Sadly, the ubik_Call* functions call SVOTE_GetSyncSite using connections associated with the USER_SERVICE_ID. Consequently, the server getting this request returns RXGEN_OPCODE, meaning that this RPC is not implemented by the service in question. Since RPC arguments cannot be changed without causing compatibility issues between different client / server versions and the RPC in question is being called through the wrong service id, remove SVOTE_GetSyncSite and its callers. Considering that in all versions of OpenAFS calls to this RPC always return RXGEN_OPCODE, no behavior change is introduced by this commit. Also, remove the "chaseCount logic" from the ubik_Call* functions. This logic prevents the loop counter from being moved backwards indefinitely, resulting in an infinite loop. Fortunately, without the VOTE_GetSyncSite() calls this counter cannot be moved backwards more than once. 
Change-Id: Idd071583e8f67109e003f7a5675de02a235e5809 Reviewed-on: https://gerrit.openafs.org/14043 Reviewed-by: Marcio Brito Barbosa Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit d369f4e5c9f975d370ee1aa7546fe9da80e1e118 Author: Cheyenne Wills Date: Fri Mar 20 12:03:48 2020 -0600 tests: Add cache-t to .gitignore in tests/opr Commit 48fbb45 (opr: Introduce opr_cache) added a new test (cache-t), but did not update the .gitignore file for it. Change-Id: I6de6130257a62f495ac942c05937eb109ce84a75 Reviewed-on: https://gerrit.openafs.org/14102 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 59fef92683da7a8c6888e2f4f5127d7b437ac028 Author: Cheyenne Wills Date: Fri Mar 20 11:54:23 2020 -0600 tests: Add core to .gitignore in tests opr/softsig-t can produce a core file as part of its test. Change-Id: I3bc7e587151e5915038e31887018889a7ffa6993 Reviewed-on: https://gerrit.openafs.org/14101 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 32d35db64061e4102281c235cf693341f9de9271 Author: Marcio Barbosa Date: Thu Feb 13 00:39:00 2020 -0300 vos: take RO volume offline during convertROtoRW The vos convertROtoRW command converts a RO volume into a RW volume. Unfortunately, the RO volume in question is not set as "out of service" during this process. As a result, accesses to the volume being converted can leave volume objects in an inconsistent state. Consider the following scenario: 1. Create a volume on host_b and add replicas on host_a and host_b. $ vos create host_b a vol_1 $ vos addsite host_b a vol_1 $ vos addsite host_a a vol_1 2. Mount the volume: $ fs mkmount /afs/.mycell/vol_1 vol_1 $ vos release vol_1 $ vos release root.cell 3. Shut down dafs on host_b: $ bos shutdown host_b dafs 4. Remove RO reference to host_b from the vldb: $ vos remsite host_b a vol_1 5. Attach the RO copy by touching it: $ fs flushall $ ls /afs/mycell/vol_1 6.
Convert RO copy to RW: $ vos convertROtoRW host_a a vol_1 Notice that FSYNC_com_VolDone fails silently (FSYNC_BAD_STATE), leaving the volume object for the RO copy set as VOL_STATE_ATTACHED (on success, this volume should be set as VOL_STATE_DELETED). 7. Add replica on host_a: $ vos addsite host_a a vol_1 8. Wait until the "inUse" flag of the RO entry is cleared (or force this to happen by attaching multiple volumes). 9. Release the volume: $ vos release vol_1 Failed to start transaction on volume 536870922 Volume not attached, does not exist, or not on line Error in vos release command. Volume not attached, does not exist, or not on line To fix this problem, take the RO volume offline during the vos convertROtoRW operation. Change-Id: I1e417a026ed819fab4435e8992311fcd4f339341 Reviewed-on: https://gerrit.openafs.org/14066 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 957b06984b77cba74bd90217b723220c1844809b Author: Marcio Barbosa Date: Fri Mar 6 15:15:38 2020 +0000 vol: fix namei_ConvertROtoRWvolume return code Commit 8632f23d6718a3cd621791e82d1cf6ead8690978 introduced checks for the return value of snprintf calls in namei_ops. On success, the value returned by this function represents the number of written characters. Unfortunately, the variable used to store this value is the same variable that represents the status code returned by namei_ConvertROtoRWvolume. Consequently, a successful execution of namei_ConvertROtoRWvolume results in a status code different from 0 (and equal to the number of written characters). To fix this problem, set the status code in question back to 0 after a successful execution of namei_ConvertROtoRWvolume.
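The snprintf pitfall described above can be reproduced in a few lines. The sketch below is illustrative only: `build_header_path`, its parameters, and the path format are hypothetical and are not taken from namei_ops; the point is simply that a variable reused for both the snprintf result and the status code must be reset to 0 on success.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a volume header path into buf.
 * Returns 0 on success and a nonzero code on error or truncation.
 */
static int build_header_path(char *buf, size_t buflen,
                             const char *dir, unsigned long volid)
{
    int code;

    /* snprintf returns the number of characters it wanted to write. */
    code = snprintf(buf, buflen, "%s/V%010lu.vol", dir, volid);
    if (code < 0 || (size_t)code >= buflen) {
        return -1;      /* output error or truncated path */
    }

    /*
     * 'code' now holds a positive character count. Returning it as-is
     * would look like a failure to callers that test for nonzero, so
     * reset it before using it as the status code.
     */
    code = 0;
    return code;
}
```

Using a separate variable for the snprintf result would avoid the problem entirely; the explicit reset mirrors the approach the commit message describes.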
Change-Id: Ic6fd6483f8d94fd64587f8bae249b9d911d846b4 Reviewed-on: https://gerrit.openafs.org/14065 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 38d78e2496c3d242e44bad401ecffe15e3883388 Author: Cheyenne Wills Date: Fri Mar 6 10:00:25 2020 -0700 afs: Clean up compiler warning casting ptr to int In osi_probe.c, the macro 'check_result' casts a pointer to an int which on older Linux kernels (e.g. 2.6.18) produces several lines with the C warning: ... warning: cast from pointer to integer of different size Change the cast from int to long int. Linux 2.6.18 doesn't provide intptr_t or uintptr_t, and stdint.h is not available to kernel modules. But the size of a pointer is the size of a long (see uintptr_t in linux/types.h - Linux 2.6.24+), so change the cast from int to long. Note that this code by default only gets pulled in for older Linux kernels (e.g. 2.6.18). For newer kernels, ENABLE_LINUX_SYSCALL_PROBING is not defined, and so most of osi_probe.c is not built. Change-Id: If1b41e11c46f4a14ff5127ed4d602485645ddf2a Reviewed-on: https://gerrit.openafs.org/14092 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Andrew Deason commit 57b4f4f9be1e25d5609301c10f717aff32aef676 Author: Andrew Deason Date: Fri Mar 13 13:00:35 2020 -0500 LINUX: Properly revert creds in osi_UFSTruncate Commit cd3221d3 (Linux: use override_creds when available) caused us to force the current process's creds to the creds of afsd during osi_file.c file ops, to avoid access errors in some cases. However, in osi_UFSTruncate, one code path was missed to revert our creds back to the original user's creds: when the afs_osi_Stat call fails or deems the truncate unnecessary. In this case, the calling process keeps the creds for afsd after osi_UFSTruncate returns, causing our subsequent access-checking code to think that the current process is in the same context as afsd (typically uid 0 without a pag).
This can cause the calling process to appear to transiently have the same access as non-pag uid 0; typically this will be unauthenticated access, but could be authenticated if uid 0 has tokens. To fix this, modify the early return in osi_UFSTruncate to go through a 'goto done' destructor instead, and make sure we revert our creds in that destructor. Thanks to cwills@sinenomine.net for finding and helping reproduce the issue. Change-Id: I6820af675edcb7aa00542ba40fc52430d68c05e8 Reviewed-on: https://gerrit.openafs.org/14098 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk Reviewed-by: Jeffrey Hutzelman Reviewed-by: Cheyenne Wills Tested-by: Cheyenne Wills commit a0071a30d532520e51262c3b6c194659e95bf389 Author: Andrew Deason Date: Thu Feb 20 09:37:28 2020 -0500 tests: Run more manpage tests by default Ever since commit f0774acd (Introduce TAP tests of man pages for command_subcommand), we've had tests to check that we have man pages for every subcommand in a command suite. This was done for several command suites, including 'bos', and 'fs', but the bos and fs tests were never added to the TESTS file. Add them, so the tests run by default in a 'make check'. Fortunately, the tests still pass today. Change-Id: I90c006845d054fa3e795203bb1deff675e558622 Reviewed-on: https://gerrit.openafs.org/14073 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit e06b47fc0e63eff2098de422628b6c03396d419f Author: Andrew Deason Date: Thu Sep 12 14:36:04 2019 -0500 ubik: Rename flags to dbFlags Rename ubik_dbase->flags to ubik_dbase->dbFlags, to make it easier to distinguish between other fields and variables just called 'flags'. 
Change-Id: I17258f9a65e989943d066307e332550d66ca7500 Reviewed-on: https://gerrit.openafs.org/13864 Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit e68109013d03829f2e9dc95586933212a0ea9ad7 Author: Andrew Deason Date: Thu Sep 12 12:37:04 2019 -0500 ubik: Clarify UBIK_VERSION_LOCK semantics Commit e4ac552a (ubik: Introduce version lock) added UBIK_VERSION_LOCK and version_data. The commit message mentions that holding either UBIK_VERSION_LOCK or DBHOLD is enough to be able to read the protected items and both locks must be held to modify them, but this isn't mentioned in the actual code. Add a comment explaining these locking rules, to make these rules clearer to readers. Change-Id: I715f89695add6d94e13d6ee1dc6addd1e748d3fd Reviewed-on: https://gerrit.openafs.org/13863 Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 086d185872da5f19447cf5ec7846e7ce5104563f Author: Cheyenne Wills Date: Wed Nov 20 12:43:03 2019 -0700 LINUX: Include linux/time.h for linux/errqueue.h The configuration test for errqueue.h fails with an undefined structure error on a Linux 3.17 (or higher) system. This prevents setting HAVE_LINUX_ERRQUEUE_H, which is used to define AFS_RXERRQ_ENV. Linux commit f24b9be5957b38bb420b838115040dc2031b7d0c (net-timestamp: extend SCM_TIMESTAMPING ancillary data struct) - which was picked up in linux 3.17 added a structure that uses the timespec structure. After this commit, we need to include linux/time.h to pull in the definition of the timespec struct. 
Change-Id: Ifab79f8454c771276d5fdf443c4d68400b70134a Reviewed-on: https://gerrit.openafs.org/13950 Reviewed-by: Andrew Deason Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 660a0855bb9351a72ef45cd72e02503c86bf2cea Author: Andrew Deason Date: Wed Sep 11 16:42:47 2019 -0500 ubik: Log urecovery_CheckTid-aborted txes Log when urecovery_CheckTid aborts/ends a running remote transaction. This is usually a rare event, occurring when some ubik sites get "stuck" or confused about the state of the quorum. Logging some details when this happens can be useful when investigating issues post-mortem, or just to see why a transaction failed. Change-Id: If0a7cd134aaac3722fe7214a1d8f0efab550ad11 Reviewed-on: https://gerrit.openafs.org/13862 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit 091e8e9ca52e408c52e3310588d6c959a517a15c Author: Andrew Deason Date: Fri Aug 23 12:21:54 2019 -0500 ubik: Introduce ubik_CallRock In OpenAFS 1.0, the way we made dbserver RPC calls was to pass the relevant RPC and arguments to ubik_Call()/ubik_Call_New(), which coerced all of the RPC arguments into 'long's. To make this more typesafe, in commit 4478d3a9 (ubik-call-sucks-20060703) most callers were converted to use ubik_RPC_name()-style calls, which used functions autogenerated by rxgen. This latter approach, however, only lets us use the ubik_Call-style site selection code with RPCs processed by rxgen; we can't insert additional code to run before or after the relevant RPC. To make our dbserver calls more flexible, but avoid coercing all of our arguments into 'long's again, move back to the ubik_Call()-style approach, but use actual typed arguments with a callback function and a rock. Call it ubik_CallRock(). 
With this commit rxgen still generates the ubik_RPC_name()-style stubs, but the stubs just call ubik_CallRock with a generated callback function, instead of spitting out the equivalent of ubik_Call() in the generated code itself. To try to ensure that this commit doesn't incur any unintended extra changes, make ubik_CallRock consist of the generated code that was inside rxgen before this commit. This is almost identical to ubik_Call, but not quite; consolidating these two functions can happen in a future commit if desired. Change-Id: I0c3936e67a40e311bff32110b2c80696414b52d4 Reviewed-on: https://gerrit.openafs.org/13987 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 78049987aa3e84865e2e7e0f3dd3b54d66258e74 Author: Cheyenne Wills Date: Tue Mar 3 15:39:49 2020 -0700 LINUX 5.6: define time_t and use timespec/timespec64 The time_t type and the structure timeval were removed for use in kernel space code in Linux commits: 412c53a680a97cb1ae2c0ab60230e193bee86387 y2038: remove unused time32 interfaces c766d1472c70d25ad475cf56042af1652e792b23 y2038: hide timeval/timespec/itimerval/itimerspec types Add an autoconf test for the time_t type. If time_t is missing, define the time_t type when building the kernel module. Change the vattr structure in LINUX/osi_vfs.h to use timespec/timespec64 instead of the timeval structure. Conditionalize the definition of gettimeofday (needed by rand-fortuna.c) in crypto/hcrypto/kernel/config.h. It is unused by the Linux kernel module and the function uses struct timeval that is no longer available. 
Change-Id: Idc9a1ded748f833d804164d29c49c9aee26ae8f5 Reviewed-on: https://gerrit.openafs.org/14083 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit b8088b49dec23da19406fcb014e7100695dc8322 Author: Andrew Deason Date: Mon Mar 2 16:17:55 2020 -0600 LINUX: Avoid building rand-fortuna-kernel.o Currently, we build rand-fortuna-kernel.o for libafs on all platforms, even though we only use the fortuna RNG on AIX, DragonFlyBSD, HP-UX, and Irix. Everywhere else, our RAND_bytes() in src/crypto/hcrypto/kernel/rand.c uses osi_readRandom() instead of going through heimdal. Building rand-fortuna.c causes occasional build headaches for the kernel on Linux (see cc7f942, "LINUX: Disable kernel fortuna large frame errors"). The most recent instance of this is that Linux 5.6 removes the definition for struct timeval, which is referenced in rand-fortuna.c. The Linux kernel is constantly changing, and so trying to keep rand-fortuna.c building on Linux seems like a waste of ongoing effort. So, just stop building rand-fortuna-kernel.o on Linux. The original intent of building this file on all platforms was to avoid bitrot, so still keep building rand-fortuna-kernel.o on all other platforms even when it's not used; just avoid it on Linux specifically, the platform that requires the most effort. To accomplish this, move rand-fortuna-kernel.o from AFSAOBJS to AFS_OS_OBJS, and remove it from the Linux-only AFSPAGOBJS. Also remove our configure tests for -Wno-error=frame-larger-than=, since they're no longer used by anything. Change-Id: I0d5f14f9f6ba2bdd7391391180d32383b4da89ed Reviewed-on: https://gerrit.openafs.org/14084 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 48fbb45967381f10df092a1ec18b5fb820387e05 Author: Andrew Deason Date: Fri Sep 20 14:19:23 2019 -0500 opr: Introduce opr_cache Add a simple general-purpose in-memory cache implementation, called opr_cache. 
Keys and values are simple flat opaque buffers (no complex nested structures allowed), hashing is done with jhash, and cache eviction is mostly random with some LRU bias. Partly based off a different implementation by mbarbosa@sinenomine.net. Change-Id: I16b5988947ff603dfe31613cd7be3908a69264e5 Reviewed-on: https://gerrit.openafs.org/13884 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 4ce922d339777faf647f7129f5ae3f173a7870b1 Author: Andrew Deason Date: Tue Jan 14 10:51:42 2020 -0600 afs: Properly type afs_osi_suser cred arg Currently, afs_osi_suser is declared with a void* argument, even though its only argument is always effectively an afs_ucred_t*. This allows us to call afs_osi_suser with any pointer type without the compiler complaining. Currently, some callers call afs_osi_suser with an incorrectly-typed afs_ucred_t** instead, like so: func(afs_ucred_t **acredpp) { afs_ucred_t **acred = *acredpp; /* incorrect assignment */ if (afs_osi_suser(acred)) { /* ... */ } } The actual code in the tree hides this to some degree behind various function calls and layers of indirection (e.g. afs_suser()), but this is effectively what we do. This causes compiler warnings because we are doing incorrect pointer assignments, but the end result works because afs_osi_suser actually uses an afs_ucred_t*. The type confusion makes it very easy to accidentally give the wrong type to afs_osi_suser. This only really matters on SOLARIS, since that is the only platform that actually uses its argument to afs_osi_suser(). To fix all of this, just declare afs_osi_suser as taking an afs_ucred_t*, and fix all of the relevant functions to handle the right type.
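The effect of typing the argument can be shown with a small stand-alone sketch. None of the names below come from the tree: `demo_ucred_t`, `demo_osi_suser`, and `demo_caller` are hypothetical stand-ins for afs_ucred_t, afs_osi_suser, and one of its callers, and the uid check is only a placeholder for whatever the platform actually does.

```c
#include <assert.h>

/* Hypothetical credential type standing in for afs_ucred_t. */
typedef struct demo_ucred {
    int uid;
} demo_ucred_t;

/*
 * With a typed parameter (rather than void *), passing a demo_ucred_t **
 * by mistake becomes a compile-time diagnostic instead of silent misuse.
 */
static int demo_osi_suser(demo_ucred_t *cred)
{
    return cred->uid == 0;
}

/* A caller that receives a pointer-to-pointer dereferences it exactly once. */
static int demo_caller(demo_ucred_t **credpp)
{
    demo_ucred_t *cred = *credpp;   /* correctly typed assignment */
    return demo_osi_suser(cred);
}
```

Writing `demo_osi_suser(credpp)` inside `demo_caller` would now be flagged by the compiler as an incompatible pointer type, which is the kind of mistake a void* parameter hides.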
Change-Id: I1366aedf0f3d7689735a9424c5272233931e3bf2 Reviewed-on: https://gerrit.openafs.org/14085 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 8d90a9d27b0ef28ddcdd3eb041c8a9d019b84b50 Author: Yadavendra Yadav Date: Thu Mar 5 07:21:55 2020 +0000 LINUX: Initialize CellLRU during osi_Init When the OpenAFS kernel module is loaded, it creates certain entries in the "proc" filesystem. One of those entries is "CellServDB"; if we read "/proc/fs/openafs/CellServDB" without starting "afsd", the result is a crash with a NULL pointer dereference. The reason for the crash is that CellLRU has not been initialized yet (since "afsd" is not started), i.e. afs_CellInit has not yet been called, so its "next" and "prev" pointers are NULL. Inside "c_start()" we do not check for NULL pointers while traversing CellLRU, and this causes the crash. To avoid this, initialize CellLRU during module initialization. Change-Id: I21cbc0e016b384f0ab456c05087384b6ed986b0d Reviewed-on: https://gerrit.openafs.org/14093 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 914193fa31af1f2aa9d755ce2215608b643053d0 Author: Michael Meffie Date: Fri Jan 24 13:40:28 2020 -0500 Cleanup vestiges of old shared library build directories Remove traces of the old shlibrpc and shlibafsauthent build directories, which are no longer needed since the conversion to libtool for building shared libraries. Change-Id: I8dbfdf9908b4a5527470b7cb4b969e7a160cdd51 Reviewed-on: https://gerrit.openafs.org/14045 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 832d0ab3124c481858bc8f440309d431cc74331f Author: Michael Meffie Date: Thu Dec 12 15:58:32 2019 -0500 doc: Replace src/SOURCE-MAP with src/README.md Replace the old and poorly maintained "SOURCE-MAP" file with a markdown formatted README.md file. Try to organize the directories in sections to hopefully make a more useful guide to the source code and build directories. Thanks to Cheyenne Wills and Benjamin Kaduk for suggestions.
Change-Id: I50f58aa99453bc3412b60a7591d6957cfa83b5b1 Reviewed-on: https://gerrit.openafs.org/14003 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit df2688cf770ed2fd3f2c782f91fd576f098676cb Author: Michael Meffie Date: Fri Feb 21 10:08:42 2020 -0500 auth: accept a NULL afsconf_dir in afsconf_SetCellInfo again Commit 93b26c6f55245e2187e574eb928f5e0ce66a245e added the cellservDB field to the afsconf_dir structure to track the CellServDB pathname. This commit also changed the afsconf_SetCellInfo() and afsconf_SetExtendedCellInfo() functions to use the new cellservDB member to open the CellServDB file. Unfortunately, the bosserver intentionally calls afsconf_SetCellInfo() with a NULL afsconf_dir pointer when attempting to create the default CellServDB and ThisCell files (e.g., "localcell"), which causes the bosserver to crash on startup when the cell configuration is not present. Fix this by calling the static function to lookup the CellServDB pathname when a afsconf_dir data object is not given. Change-Id: I8d36f7c8afe6b4e13bfd04c421bf1109d1eb4238 Reviewed-on: https://gerrit.openafs.org/14061 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 302a203cf99fc0f11a402a31121cbe306f9bed30 Author: Michael Meffie Date: Thu Feb 20 16:09:49 2020 -0500 auth: pass the directory name to _afsconf_CellServDBPath Change the signature of the _afsconf_CellServDBPath() static function to take just the base directory name of the CellServDB file instead of the entire afsconf_dir data object. This makes it clear we do not need other members of the afsconf_dir structure to compose the CellServDB path. 
Change-Id: I57509b2ca09123e78df5533d63494c66b5b24cdf Reviewed-on: https://gerrit.openafs.org/14076 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Andrew Deason commit 7c431f7571bbc32b26180086d10932d41d0da08c Author: Michael Meffie Date: Thu Feb 20 15:58:27 2020 -0500 auth: retire writeconfig.c Move the afsconf_SetCellInfo() and afsconf_SetExtendedCellInfo() to the cellconfig.c file with the other afsconf_dir functions. Retire the now empty writeconfig.c file. At one point in the distant past afsconf_SetCellInfo() did not have a afsconf_dir argument, so it probably made sense to have a separate file to write the configuration. Later, the afsconf_dir argument was added to afsconf_SetCellInfo() and afsconf_SetExtendedInfo() to reset the auth cache, so these functions are now better placed in cellconfig.c. Note the contents of writeconfig.c were moved verbatim (including comments), so this commit should have no functional changes. Change-Id: Idff76f0d2dfa2383a8617373f0e38235a94f20f1 Reviewed-on: https://gerrit.openafs.org/14075 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit de031398c652045394adc150faaf0dcb6cf28bc3 Author: Andrew Deason Date: Wed Oct 2 15:14:21 2019 -0500 opr: Define opr_mutex_t in lockstub.h Like we do for opr_cv_t, define an opr_mutex_t to be a plain int, to allow opr mutexes to be defined easily without ifdef guards. Change-Id: Ib90017ac098ebc68ffd89890d448aabb2321f63e Reviewed-on: https://gerrit.openafs.org/13886 Reviewed-by: Benjamin Kaduk Reviewed-by: Michael Meffie Reviewed-by: Mark Vitale Reviewed-by: Cheyenne Wills Tested-by: BuildBot commit 71a825a3d86faeaf69645d5faab1a14558069c4c Author: Benjamin Kaduk Date: Fri Jan 24 21:42:33 2020 -0800 RedHat: support the ppc64le architecture Reported by zhenjiang.cai@powercore.com.cn. 
FIXES 135065 Change-Id: I79718a8b4da8a73edf40e0221308c9babc5e85b5 Reviewed-on: https://gerrit.openafs.org/14046 Tested-by: BuildBot Reviewed-by: Stephan Wiesand Reviewed-by: Michael Meffie Reviewed-by: Yadavendra Yadav Reviewed-by: Benjamin Kaduk commit cd3221d3532a28111ad22d4090ec913cbbff40da Author: Jeffrey Hutzelman Date: Thu May 2 16:02:47 2019 -0400 Linux: use override_creds when available Linux may perform some access control checks at the time of an I/O operation, rather than relying solely on checks done when the file is opened. In some cases (e.g. AppArmor), these checks are done based on the current task's creds at the time of the I/O operation, not those used when the file was opened. Because of this, we must use override_creds() / revert_creds() to make sure we are using privileged credentials when performing I/O operations on cache files. Otherwise, cache I/O operations done in the context of a task with a restrictive AppArmor profile will fail. Change-Id: Icbe60874c348d6cd92b0a186d426918b0db9b0f9 Reviewed-on: https://gerrit.openafs.org/13751 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 042f809ccfe12bafed73aa4eb4db2c86737e0b22 Author: Michael Meffie Date: Fri Oct 18 13:43:36 2019 -0400 warn when starting without keys The server processes will happily start without keys and then fail all authenticated access, including database synchronization and local commands with -localauth. At least issue warnings to let admins know the keys are missing and that akeyconvert or asetkey needs to be run. The situation is not helped by the fact that the filenames of the key files have changed between versions. In 1.6.x the (non-DES) keys were in the rxkad.keytab file and in later versions they are in the KeyFile* files, so if you are used to 1.6.x it is not obvious what is wrong.
Change-Id: Iff7fe9a5a5a0f5ea1f4e227d3f6129658f8eb598 Reviewed-on: https://gerrit.openafs.org/13911 Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit a5f031d2fe50f068f5517ff8d64324c127b6420d Author: Mark Vitale Date: Wed Feb 19 14:48:07 2020 -0500 improve command-line help for --enable_peer_stats The command-line help for several OpenAFS servers lists an inaccurate description for the --enable_peer_stats option: "enable RX transport statistics" Improve the help description to be more clear and consistent with the description for --enable-process-stats. Introduced by the following commits: cd3492d volser: Convert command line parsing to cmd a5effd9 viced: Use libcmd for command line options 461603e vlserver: Use libcmd for command line parsing 0b9986c ptserver: Use libcmd for command line parsing Change-Id: Ibe23c61d4b838f3a3185390b18d25494fffde2ca Reviewed-on: https://gerrit.openafs.org/14072 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 1626986bd6d70c526376cf7cedfd3ebbf6d3588a Author: Cheyenne Wills Date: Tue Feb 11 11:29:42 2020 -0700 LINUX 5.6: use struct proc_ops for proc_create The Linux commit d56c0d45f0e27f814e87a1676b6bdccccbc252e9 (proc: decouple proc from VFS with "struct proc_ops") was merged into Linux 5.6rc1. The commit replaces the 'file_operations' parameter for proc_create with a new structure 'proc_ops'. Conditionally initialize and use proc_ops structures instead of file_operations structures for calls to proc_create. 
Notes: * proc_ops.proc_ioctl is equivalent to file_operations.unlocked_ioctl * The macros HAVE_UNLOCKED_IOCTL and HAVE_COMPAT_IOCTL are both hardcoded to 1 in linux's fs.h * proc_ops.compat_ioctl is conditional on Linux's CONFIG_COMPAT macro which is a separate test from the HAVE_COMPAT_IOCTL macro Change-Id: I8570ca499696b4c31b381543107453fbfe355376 Reviewed-on: https://gerrit.openafs.org/14063 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit 6d6a28720f4eae4652f2628fdfcc30983916f39d Author: Marcio Barbosa Date: Fri Feb 7 14:58:56 2020 -0300 macos: add anchors to synthetic.conf grep pattern The grep pattern that checks if /etc/synthetic.conf already has an entry for afs is intended to check if this file holds a single column entry named afs. Unfortunately, the current version does not completely enforce this restriction. To fix this problem, add anchors to the grep pattern in question. Change-Id: I15a1fa1c250027b7d3ab67e686cbfbae853251a2 Reviewed-on: https://gerrit.openafs.org/14062 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Yadavendra Yadav Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 09ec1073b4c5d2eb70dcf5d8063018bc82e5a35e Author: Mark Vitale Date: Sun Jan 26 20:17:40 2020 -0500 afs: silence bogus warning about dcListCount uninitialized Commit 3be5880d1d2a0aef6600047ed43d602949cd5f4d 'afs: Avoid panics in afs_InvalidateAllSegments' is correct, but at least one compiler (gcc 4.3.4 on SLES 11.3) is fooled into issuing a warning: [...]/afs_segments.c: In function 'afs_InvalidateAllSegments_once': [...]/afs_segments.c:506: error: 'dcListCount' may be used uninitialized in this function To silence the bogus warning, initialize dcListCount when defined. 
Change-Id: I5938c85c71d08ed61ec1f69a50afb19c9b31fa82 Reviewed-on: https://gerrit.openafs.org/14048 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit 9238b1eb9ef02889855eaade76e5b7962e5f2f28 Author: Michael Meffie Date: Mon Jul 22 15:20:24 2019 -0400 vos: fix name availability check in vos rename The UV_RenameVolume() function first updates the volume name in the VLDB, then the read-write volume header and backup volume header, and finally all of the read-only volume headers. If this function is interrupted or a remote site is not reachable, the names in some of the volume headers will be out of sync with the name in the VLDB entry. The implementation of UV_RenameVolume() is idempotent, so it can be safely called with the same name as in the volume's VLDB entry. This could be used to bring all the names in the volume headers in sync with the name in the VLDB. Unfortunately, due to the check of the -newname parameter, vos rename will not invoke UV_RenameVolume() when the name in the VLDB has already been changed. The vos rename command attempts to verify the desired name (-newname) is available before invoking UV_RenameVolume() by simply checking if a VLDB entry exists with that name, and incorrectly assumes that when a VLDB entry exists with that name it is an entry for a different volume. Change the -newname check to allow vos rename to proceed when the name has already been set in the VLDB entry of the volume being renamed. This allows admins to run the vos rename command to complete a previously incomplete rename operation and bring the names in the volume headers in sync with the name in the VLDB entry. Note: Before this commit, administrators could work around this vos rename limitation by renaming the volume twice, first to an unused volume name, then to the actual desired volume name. Remove the useless checks of the code1 return code after exit in the RenameVolume() function.
These checks for code1 are never performed since the function exits early when the first VLDB_GetEntryByName() fails for any reason. Update the vos rename man page to show vos rename can be used to fix previously interrupted/failed rename. Also document the -oldname parameter accepts a numeric volume id to specify the volume to be renamed. Change-Id: Ibb5dbe3148e9b8295347925a59cd7bdbccbe8fe0 Reviewed-on: https://gerrit.openafs.org/13720 Reviewed-by: Cheyenne Wills Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 6c54bc9e121b923ec5fdd60ee510171987e55017 Author: Mark Vitale Date: Mon Jan 27 12:26:41 2020 -0500 uss: more gcc9 truncation warning appeasement uss_procs_PickADir needs a larger buffer to avoid a truncation warning. While here, replace some magic numbers with existing symbols. Change-Id: If981dddfa50bdbc8c4730cf8038429f071b1d5be Reviewed-on: https://gerrit.openafs.org/14049 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit bf1b3e2fc12a7502cfd74eb109eeb7131f7230d3 Author: Michael Meffie Date: Fri Jan 10 10:54:20 2020 -0500 tests: skip vos tests when a vlserver is already running The vos tests start a temporary vlserver process, which is problematic when the local system already has an installed vlserver. Attempt to temporarily bind a socket to the vlserver port, and if unable to bind with an EADDRINUSE error, assume the vlserver is already running and skip these tests. Change-Id: I1dd3bc4c7ebcd2c7bffc8aca422222a50058090e Reviewed-on: https://gerrit.openafs.org/14021 Reviewed-by: Cheyenne Wills Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 6d309f86089ea707dbeb6ab553e3dfd23b6c338c Author: Andrew Deason Date: Thu Jan 9 12:28:57 2020 -0600 afs: Remove osi_VMDirty_p The function osi_VMDirty_p is mentioned in a few places in src/afs, but it has always been ifdef'd or commented out, ever since OpenAFS 1.0. Remove the dead code. 
Change-Id: Ia7cad718114d91adf9e403e29f9ac976c3f08bfd Reviewed-on: https://gerrit.openafs.org/14023 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 6ee2d6de7d87c93c849f3afbe4326906e4c10852 Author: Andrew Deason Date: Thu Jan 9 12:38:45 2020 -0600 aklog: Make dummy write AIX-specific This weird write() call exists to work around some old AIX-specific bug. The ifdef looks like it is intended to restrict this to pre-5 AIX, but it also turns this on for all non-AIX platforms. Make this area AIX-specific, to avoid this weird write on other platforms that have nothing to do with the relevant workaround. Change-Id: I092bcadb4ecc6277ae01e44e6a957e6bacc0cf2d Reviewed-on: https://gerrit.openafs.org/14022 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit dcf44ab5fc5c1f5e2e759ea4b6156f7e1faa4b7a Author: Michael Meffie Date: Fri Jan 10 09:06:38 2020 -0500 tests: do not resolve addresses in vos/vl test The vos-t test adds a set of 10.* test addresses to a test vlserver and runs vos to read them back. When the test is run in an environment where hosts have been assigned in the 10.* internal network, vos will resolve the addresses to hostnames and the test fails. Pass the -noresolve option to vos for this test when checking for the expected list of addresses. Example test output before this commit: ./vos-t ... # seen: 10.0.0.0 10.0.0.1 myhost.example.com 10.0.0.3 ... not ok 5 - vos output matches Change-Id: Ief43fe180a0dfff211f28d5f47be6224270907a3 Reviewed-on: https://gerrit.openafs.org/14020 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 37c5db3ce767868803135c916b282ff2e541d052 Author: Andrew Deason Date: Sun Dec 1 15:39:04 2019 -0600 FBSD: Declare vnops/vfsops static Declare our vnode and vfs operations as static functions, since they are not referenced outside of osi_vfsops.c/osi_vnodeops.c. Shuffle around the definitions in osi_vnodeops.c so that we don't need forward declarations for the functions. 
Change-Id: Idbbe05a8b248ac29c2795c365be6a4e99da536dd Reviewed-on: https://gerrit.openafs.org/13973 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit a4e9365fff2b0e3daf7e9cf2b40e6027b7dd3a15 Author: Andrew Deason Date: Sun Dec 1 15:27:01 2019 -0600 FBSD: Remove support for 8.x and 9.x According to , FreeBSD 8.x EoL was on August 1, 2015, and FreeBSD 9.x EoL was on December 31, 2016. Remove our support for these versions, since they haven't been supported by FreeBSD itself for a while. FreeBSD 10.x EoL was on October 31, 2018, which has passed, but was less than a year ago. So keep 10.x in for now. Adjust our preprocessor checks accordingly: - In FBSD-specific dirs, assume AFS_FBSD100_ENV and lower is always true. Assume __FreeBSD_version is always at least 1000000. - In non-FBSD dirs, convert AFS_FBSD100_ENV and lower to AFS_FBSD_ENV. Change-Id: I965e65d3b95573bb374661217b24b686c7b68ed2 Reviewed-on: https://gerrit.openafs.org/13842 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit eab0bb0af87e9309bfb6b754f3521d24288bd933 Author: Andrew Deason Date: Wed Jan 1 20:25:05 2020 -0600 tests: Explicitly build target 'all' by default Commit 68f40643 (Build tests by default) added new targets in our top-level Makefile, that caused us to effectively run 'cd tests && make' as part of the default build. Since no explicit target is provided, 'make' tries to build the first target in the given Makefile. On some platforms (such as *BSD), 'make' finds the first defined target as a pattern rule (%.c) from our included makefiles, and tries to build the target %.c, which it cannot do. This causes the build to fail with: cd tests && make make[3]: don't know how to make %.c. Stop To fix this, just explicitly build the 'all' target when we build our tests by default. 
Change-Id: I319271482685ec35087c470d95fdcaec6e1d8c47 Reviewed-on: https://gerrit.openafs.org/13993 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Michael Meffie commit ce7a76a13e4009262dc42a6c93c371fb26116d41 Author: Andrew Deason Date: Tue Dec 31 12:25:32 2019 -0600 tests: Stop vlserver on errors Currently, if we encounter an error and 'goto out' after starting the test vlserver, we'll exit without stopping the test vlserver. This can confuse the test harness, causing 'runtests' to hang forever. To avoid this, move the afstest_StopServer() call to also run when we're bailing out, but only if the server has actually started, of course. Change-Id: Ice5a56c20bc8d2eac85b3e760850c4d85e4601a8 Reviewed-on: https://gerrit.openafs.org/13992 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Michael Meffie commit a21a2f8edb79d6190976e920a9a90d0878411146 Author: Andrew Deason Date: Tue Dec 31 12:04:48 2019 -0600 tests: Introduce afstest_GetProgname Currently, in tests/volser/vos-t.c we call afs_com_err as "authname-t", which is clearly a mistake during some code refactoring (introduced in commit 2ce3fdc5, "tests: Abstract out code to produce a Ubik client"). We could just change this to "vos-t", but instead of specifying constant strings everywhere, change this to figure out what the current command is called, and just use that. Put this code into a new function, afstest_GetProgname, and convert existing tests to use that instead of hard-coding the program name given to afs_com_err. Change-Id: I3ed02c89f93798568783c7d717e8fb2e39dcce14 Reviewed-on: https://gerrit.openafs.org/13991 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 48d181ca1f4d753a51305d0352dadefed4323c00 Author: Andrew Deason Date: Tue Jan 7 13:02:21 2020 -0600 libtool: Serialize building libfoo.la and libfoo.a We have a few libraries where we have separate targets to build libfoo.la (to get libfoo.so) and libfoo.a.
Currently, these targets can be built in parallel, and both are built with libtool. This can cause problems because of two behaviors with libtool: - When running --mode=link for libfoo.a or libfoo.la, it effectively runs 'rm -rf .libs/libfoo.*' to clean up its work area. - When running --mode=link for libfoo.a, libtool sets up some scratch space in .libs/libfoo.ax to unpack various static libs. So when 'make libfoo.a' is running, libtool creates a .libs/libfoo.ax dir, and unpacks various object files inside of it. If while that is running, 'make libfoo.la' runs, it causes libtool to remove that directory and all its contents. This causes 'make libfoo.a' to fail with confusing messages like this (for libafsrpc.a): /bin/sh ../../libtool --quiet --mode=link --tag=CC gcc -static -O -o libafsrpc.a [...] find: '.libs/libafsrpc.ax/libopr_pic.a': No such file or directory ar: .libs/libafsrpc.ax/libfsint_pic.a/afscbint.cs.o: No such file or directory make[3]: *** [Makefile:59: libafsrpc.a] Error To avoid this, prevent building libfoo.la and libfoo.a at the same time, by just making libfoo.la depend on libfoo.a. Do this for all of the libraries we build in this way: libafshcrypto, libkopenafs, libafsauthent, and libafsrpc. Change-Id: I821768b3b4cd99cf5bf98605068773347ada0fb2 Reviewed-on: https://gerrit.openafs.org/14017 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 057f848a9c7b12afbe6563878760c1eab64b99b3 Author: Andrew Deason Date: Fri Nov 1 15:19:23 2019 -0500 ubik: Introduce ugen_secproc_func We currently specify the signature of the 'secproc' function callback in multiple places. Consolidate them into a single typedef. 
Change-Id: Ic785f47fc726bff6c37f7fd826f1e2626d006776 Reviewed-on: https://gerrit.openafs.org/13986 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 86170750dd2cc49781fad53e539d67f4c1ed0a84 Author: Andrew Deason Date: Wed Oct 9 13:54:40 2019 -0500 doc: Document new rxgk options Commit e5b1e6f1 (Add rxgk client options to vl and pt utilities) added a couple of new command-line options related to rxgk, but didn't add them to the relevant man pages. Add a brief description of these new options to the manpages for pts, vos, ptserver, and vlserver. Change-Id: I2d9bfdeb0a31d396740ca2a4d42e14c025b6f79e Reviewed-on: https://gerrit.openafs.org/13947 Reviewed-by: Cheyenne Wills Reviewed-by: Andrew Deason Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit bebae936b4ef3bf47624c0ff0baae5521bad804e Author: Cheyenne Wills Date: Thu Jan 2 11:18:16 2020 -0700 afs: Fix EIO error when reading a 4G or larger file When reading a file with a file length of >= 4G, the cache manager is failing the read with an EIO error. In afs_GetDCache, the call to IsDCacheSizeOK is passed a parameter that contains only the lower 32bits of the file length (which requires a 64 bit value). This results in the EIO error if the length is over 2^32 -1. The AFSFetchStatus.Length member needs to be combined with the AFSFetchStatus.Length_hi to obtain the full 64bit file length. Fix the calls to IsDCacheSizeOK to use the full 64bit file length. Commit "afs: Check dcache size when checking DVs 7c60a0fba11dd24494a5f383df8bea5fdbabbdd7" - gerrit 13436 - added the IsDCacheSizeOK function and the associated calls. As a note, the AFSFetchStatus.DataVersion is the lower 32 bits of the full 64bit version number, AFSFetchStatus.dataVersionHigh contains the high order 32bits. The function IsDCacheSizeOK is passed just the 32bit component, the only use of the parameter is in an error message. 
Change-Id: Idbe6233bd6ef792ed2b92d9337aba334e23f1452 Reviewed-on: https://gerrit.openafs.org/14002 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit daf6616aab6732d6b417c15f6f401731ef8e44b5 Author: Marcio Barbosa Date: Sat Dec 21 19:56:41 2019 -0800 macos: add entry for afs into synthetic.conf The root mount point is read-only as of macOS 10.15. As a result, /afs cannot be created at this location. To work around this restriction, macOS 10.15 provides an alternative way to create mount points at the root. To make it possible, an entry for the mount point in question must be added to /etc/synthetic.conf. The synthetic entities described in this file are not physically present on the disk. Instead, they are synthesized by the kernel during system boot. This commit adds an entry for afs into the file mentioned above. Knowing that this change only takes effect after reboot, also provide directions to the user during the installation process. Change-Id: I7a05f4b9a48e443dbaa20a624a92b8b54c510000 Reviewed-on: https://gerrit.openafs.org/13928 Tested-by: BuildBot Reviewed-by: Yadavendra Yadav Reviewed-by: Benjamin Kaduk commit 0563642cc1cb750c69a6471005adf36fabb2b7e3 Author: Marcio Barbosa Date: Sat Dec 21 19:11:57 2019 -0800 macos: add script to notarize OpenAFS In order to integrate the notarization process into our existing build scripts, this patch introduces a script to automatically notarize the OpenAFS package. Change-Id: Ia9743cd39485e68de540b79b165b9d92020ad187 Reviewed-on: https://gerrit.openafs.org/13671 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 10d176afd23bbf684017a7946dffb1d592ea04fa Author: Andrew Deason Date: Wed Oct 23 15:46:16 2019 -0500 Do not build shared-only libs for --disable-shared Commit 0f1e54c4 (Pass -shared when linking some shared libraries) changed some of our linking rules to pass -shared to libtool when linking.
When building with the --disable-shared configure option, this causes those linker rules to fail, since shared libraries are disabled. Before commit 0f1e54c4, we could build with --disable-shared successfully. To allow us to build again with --disable-shared, just don't build the relevant shared-only libraries at all, when shared libraries are disabled. To accomplish this, introduce a new substitution variable, SHARED_ONLY, which allows certain lines in Makefiles to become commented-out when shared libraries are disabled. Update all of the shared-only libraries to be built conditionally based on this variable. Except for libuafs.la, which appears to be not referenced by anything. Just remove the rules for that instead. Change-Id: I82084a08d2f9c12ca438bd7b1626e1376159c975 Reviewed-on: https://gerrit.openafs.org/13927 Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit d0941e81b2f1f499cebb57d8a81d82802913d9be Author: Andrew Deason Date: Fri Oct 25 19:04:44 2019 -0500 pts: Use cmd_AddParmAtOffset for common parms Update pts to use cmd_AddParmAtOffset and symbolic constants for our common parameters, instead of using bare literals like '16'. Change-Id: Ib8fe77983a6bba46c3182585774e067512449f0e Reviewed-on: https://gerrit.openafs.org/13946 Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 90726f837cd03a4eef745ab6bc221987042a72a6 Author: Andrew Deason Date: Tue Oct 29 20:17:39 2019 -0500 tests: Check if vlserver died during startup Currently, the volser/vos test starts a local vlserver to communicate with. If the vlserver dies during startup, the spawned 'vos' subprocesses take forever to run, since we need to wait for our Rx calls to timeout for every operation. To make it less annoying to detect and investigate errors that might cause the vlserver to fail during startup, check if the vlserver dies right away. 
We already sleep for 5 seconds when starting the vlserver, so just check if the pid still exists after those 5 seconds. Change-Id: I6c33059542fa975e4cb389b718f9da190cd13289 Reviewed-on: https://gerrit.openafs.org/13942 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 94acb9f36b2e14d24a485e016ec7ab264115c0be Author: Andrew Deason Date: Mon Sep 9 14:27:40 2019 -0500 rx: Make rx_identity_free idempotent rx_identity_free sets the given identity to NULL, but it unconditionally derefs the given identity. Make it a no-op for NULL identities, to make related cleanup code and destructors simpler. Change-Id: I863c72be71fb4b3056a2cd8fc2bf19cfb2d5dfbb Reviewed-on: https://gerrit.openafs.org/13945 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit d3d2530691a0d5e45e6752d5cc012357ecbd410e Author: Andrew Deason Date: Wed Aug 21 12:43:03 2019 -0500 rx: Make rx_opaque_free idempotent Currently rx_opaque_free sets the given argument to NULL, a style that helps prevent double-frees. However, it doesn't check if the given buffer is already NULL, which makes potential callers that use a 'goto done'-style cleanup block do something like: done: if (buf) rx_opaque_free(&buf); To avoid the extra if(), make rx_opaque_free a no-op if it's given a NULL buffer, similar to how free(NULL) is a no-op on most platforms. Slightly refactor how we reference our argument as well, to limit the number of layers of indirection the code needs to deal with. Do the same for rx_opaque_zeroFree. Note that there are currently no callers of rx_opaque_free/rx_opaque_zeroFree, but future commits will add some. 
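The "free and set to NULL, no-op on NULL" style described above can be sketched as follows. This is only an illustration under stated assumptions: struct example_buf, example_buf_new, and example_buf_free are hypothetical names, not the real struct rx_opaque or rx_opaque_free, whose definitions are not shown in this log.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer type; not the real struct rx_opaque. */
struct example_buf {
    size_t len;
    void *val;
};

/* Allocate a zeroed buffer of 'len' bytes (len > 0); NULL on failure. */
static struct example_buf *
example_buf_new(size_t len)
{
    struct example_buf *buf = calloc(1, sizeof(*buf));
    if (buf == NULL)
        return NULL;
    buf->val = calloc(1, len);
    if (buf->val == NULL) {
        free(buf);
        return NULL;
    }
    buf->len = len;
    return buf;
}

/* Free *a_buf and set it to NULL. A no-op if *a_buf is already NULL. */
static void
example_buf_free(struct example_buf **a_buf)
{
    struct example_buf *buf;

    assert(a_buf != NULL);
    buf = *a_buf;        /* one local copy limits the indirection below */
    if (buf == NULL)
        return;
    *a_buf = NULL;
    free(buf->val);
    free(buf);
}
```

With this shape, a 'goto done'-style cleanup block can call example_buf_free(&buf) unconditionally, and calling it a second time is harmless.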
Change-Id: Ic86a9c63903bebbddd311912cfbcb61198e3f0b0 Reviewed-on: https://gerrit.openafs.org/13944 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 71bf9ac08c1dd7566fd5d6b438293614afdc1d13 Author: Andrew Deason Date: Mon Sep 23 22:43:30 2019 -0500 ptserver: Fix WhoIsThisWithName indentation Many lines in this block in WhoIsThisWithName are oddly indented by 1 more space than usual. Fix them. Change-Id: I5e3ec4974cebc694c7b02c1ea6e037d4ec335a12 Reviewed-on: https://gerrit.openafs.org/13943 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 68f406436cc21853ff854c514353e7eb607cb6cb Author: Andrew Deason Date: Tue Oct 29 17:22:04 2019 -0500 Build tests by default While it's not feasible to run all of our tests by default during the build, we should be able to at least make sure the tests can build. So, make the default build targets also build our tests, by making the 'finale' target build the tests. Change-Id: Ieadd48ba2774526de8a13136e6cc8a50434ed2f5 Reviewed-on: https://gerrit.openafs.org/13941 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 0b8b6683fb525bbeaf118014beb2371e0cf23d90 Author: Andrew Deason Date: Mon Nov 11 20:34:27 2019 -0600 tests: Fix manpage tests for objdir builds The manpage tests have a couple of problems when running for objdir builds: - We try to specify './tests-lib/perl5' as a directory to find our helper library. However, the cwd when we're running the tests is in an objdir build, where the helper library is in the srcdir. Fix this by using the SOURCE env var specified by the tests wrapper. - All of these tests specify the directory in which to find the man pages in a subdir of BUILD, but our manpages are located in the src dir (since they are built by regen.sh, not by configure/make). Fix this by specifying a SOURCE-based directory instead. 
To avoid needing to make the same change for each of these tests, also refactor the manpage tests so each test only needs to specify the subdirectory and command name, and get rid of some of the common boilerplate. Change-Id: I96be199b1dec8db0545ae3cf19d2595c4afe4cdd Reviewed-on: https://gerrit.openafs.org/13940 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 63fd13bf9e6af21136007c9980816875ebea5f7c Author: Marcio Barbosa Date: Tue Nov 26 11:41:36 2019 -0800 macos: prepare for notarization With the public release of macOS 10.14.5, all new and updated kernel extensions must be notarized by Apple. To be taken into consideration, all executables must be signed and the Hardened Runtime capability must be enabled. This patch adds the missing prerequisites mentioned above. Change-Id: I2d3ad66cb7ce062b91d0616955f3bc2b06ca5822 Reviewed-on: https://gerrit.openafs.org/13670 Reviewed-by: Cheyenne Wills Reviewed-by: Andrew Deason Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit c7864b73603842b8beaee03fcbb2426890205410 Author: Marcio Barbosa Date: Fri Jun 28 00:40:55 2019 -0300 macos: packaging support for MacOS X 10.15 This commit introduces the new set of changes / files required to successfully create the dmg installer on OS X 10.15 "Catalina". Change-Id: I628a3210fa42b2f34ff78030930f83e836775392 Reviewed-on: https://gerrit.openafs.org/13669 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 93815caabc92acc6edc62b72805b44d2e46748cf Author: Marcio Barbosa Date: Mon Nov 18 06:34:08 2019 -0800 macos: add support for MacOS 10.15 This commit introduces the new set of changes / files required to successfully build the OpenAFS source code on OS X 10.15 "Catalina". 
Change-Id: I849d4c837bf9ae36fe5c33356bc1c66a2fc513ac Reviewed-on: https://gerrit.openafs.org/13668 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit d4302d42149988fa6d04d626967063dfa916c9fd Author: Marcio Barbosa Date: Thu Dec 12 19:03:04 2019 -0800 macos: upgrade *.xib files According to Xcode 11, the *.xib files updated by this commit use an older format that is potentially insecure when decoded. To fix this problem, Xcode automatically upgraded these files to the modern format. These changes are required to build OpenAFS on Catalina (Xcode 11). Change-Id: Ica8c464eff93496d87fc854b193bfb0dad07a3c2 Reviewed-on: https://gerrit.openafs.org/13935 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 677b038814817defec9421e698ce67b44a7fd7d1 Author: Marcio Barbosa Date: Thu Nov 7 23:56:13 2019 -0300 macos: tell the compiler the system include path In order to support multiple SDKs, macOS Catalina no longer has the /usr/include directory. As a result, the compiler needs to know where these headers can be found. To successfully build OpenAFS on OSX 10.15, set KROOT so the compiler knows the correct location of these headers. Change-Id: I5ef33b34b6a4e6111983a63a2d34326ca4af9d30 Reviewed-on: https://gerrit.openafs.org/13936 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit f4ab3767b7e65028b93e731da6f09ee385c51daf Author: Andrew Deason Date: Mon Nov 11 20:34:07 2019 -0600 tests: Fix most tests for objdir builds Fix a few miscellaneous issues with building and running our tests in objdir builds: - Our C tests use -I$(srcdir)/../.. in the CFLAGS, so we can #include <tests/tap/basic.h>. However, basic.h actually gets copied from src/external/c-tap-harness/tests/tap/ to tests/tap/ during the build, and so basic.h is available in the objdir, not srcdir. For objdir builds, this causes building the tests to fail with failing to find basic.h. Fix this to use TOP_OBJDIR as the include path instead.
- Our 'make check' in tests/ tries to run ./libwrap; but our cwd will be in the objdir for objdir builds, and libwrap is a script in our srcdir. Fix this to run libwrap from the srcdir path. - In tests/opr/softsig-t, it tries to find the 'softsig-helper' binary in the same dir as 'softsig-t'. However, softsig-t is just a script in the srcdir, but softsig-helper is a binary built in the objdir. Fix this to use the BUILD env var provided by the tests wrapper, by default. Change-Id: Iff642613bfc88d0d7e348660dc62f59e6fa8af75 Reviewed-on: https://gerrit.openafs.org/13939 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 847b63af92dd527de31675a0c3c82c9a57e6c4b3 Author: Andrew Deason Date: Sun Aug 25 23:21:23 2019 -0500 FBSD: Remove pre-8 code Commit 123f0fb1 (config: remove support for old FreeBSD releases) removed our support for FreeBSD releases before FreeBSD 8. However, various areas of code still reference the symbols from those old versions (e.g. AFS_FBSD53_ENV). Remove our ifdef logic for these old symbols, according to the following rules: - In FBSD-specific dirs, assume AFS_FBSD80_ENV is always true (as well as the symbols for earlier versions) - In non-FBSD dirs, convert AFS_FBSD80_ENV to AFS_FBSD_ENV (and do the same for all earlier versions) This allows us to remove code that was specific to older FreeBSD versions, and simplify some ifdef conditionals. Also remove the definitions for AFS_FBSD80_ENV and earlier versions in our existing param.h files. With this commit, the functions afs_start, afs_vop_lock, afs_vop_unlock, and afs_vop_islocked are now always unreferenced, so remove them. Change-Id: Ia5a5ba5ee5b71a86cb4514305e20f1bb34487100 Reviewed-on: https://gerrit.openafs.org/13812 Tested-by: BuildBot Reviewed-by: Tim Creech Reviewed-by: Benjamin Kaduk commit f9c716fca1becea5a41fbe86535759ef817c924d Author: Yadavendra Yadav Date: Fri Dec 6 15:23:34 2019 +0530 afs: Add ppc64le changes in osconf.m4 file. 
If the swig package is installed on a ppc64le system, the build fails for "libuafs" while running "shlib-build". "shlib-build" gets executed for building ukernel.so and this is triggered if "LIBUAFS_BUILD_PERL" is not empty. Having "swig" package on system sets "LIBUAFS_BUILD_PERL" to 'LIBUAFS_BUILD_PERL' value. The reason for build failure was inside "shlib-build", 'linker' was not set (it was empty). 'linker' value is set based on SHLIB_LINKER, which was not defined in osconf.m4 if build system is ppc64le. To fix this add ppc64le_linux26 case in osconf.m4 file. Change-Id: I79d2f78b2af34207c81f4f5ab05fdc387404acad Reviewed-on: https://gerrit.openafs.org/13980 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit d79a8e13e5c1f6d1cf13a308ea506609b578ed84 Author: Cheyenne Wills Date: Mon Dec 2 13:12:00 2019 -0700 util: Use a struct for afsUUID_to_string Replace the use of a character array with a structure that contains the size of the buffer that is needed. This allows the C compiler to perform a type check to ensure the correct sized buffer is used. In addition, the size of the buffer is now specified in just one location. Change the signature of the afsUUID_to_string function to return a pointer to the start of a formatted UUID. This allows the use of afsUUID_to_string in a way that is consistent with other object formatting functions: struct uuid_fmtbuf uuidstr; printf("... %s ...", afsUUID_to_string(uuid, &uuidstr)); Update callers to use the new uuid_fmtbuf struct when calling afsUUID_to_string. Change-Id: I6d6f86ce6c058defc6256e8e88dee4449dd4f7e6 Reviewed-on: https://gerrit.openafs.org/13831 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit f5f8b9336919debc5c26c429b12a14b65e0b697c Author: Marcio Barbosa Date: Thu Nov 14 17:29:56 2019 -0300 viced: add opt to allow admin writes on RO servers Add the new option -admin-write to allow write requests from superusers on file servers running in readonly mode (-readonly).
This lets sites run fileservers in readonly mode for normal users, but allows members of the system:administrators group to modify content. Change-Id: Id8ed3513a748815c07cb98e426c1d21ac300b416 Reviewed-on: https://gerrit.openafs.org/13707 Reviewed-by: Andrew Deason Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 7cdf1a93cfdfd4a0959200197f000679199abbd4 Author: Andrew Deason Date: Fri Nov 29 11:42:47 2019 -0600 afs: Skip checking chunkBytes sanity for RW files Currently, the IsDCacheSizeOK check can trigger a false positive for a dcache, if the data in the dcache was populated by a local write to a file that was later extended with sparse data. For example: say a client opens a new file, and writes 4 bytes to offset 0, and then writes 4 bytes to offset 0x400000. After the first write, the first chunk for the file will contain just 4 bytes, and after the second write, the first chunk is unchanged (since we're writing to a different area of the file), but the file is now 0x400004 bytes long. The sparse area of the file will be correctly filled with zeroes for local reads and on the fileserver, but the 4-byte chunk causes IsDCacheSizeOK to complain and mark the dcache as invalid. Even though nothing is wrong, this causes the following scary messages to potentially appear in the kernel log, and the relevant dcache to be invalidated: afs: Detected corrupt dcache for file 1.536870913.2.2: chunk 0 (offset 0) has 4 bytes, but it should have 131072 bytes afs: (dcache 0xfffffdeadbeefb4d, file length 4194308, DV 1, dcache mtime 1575049956, index 996, dflags 0x2, mflags 0x0, states 0x4, vcache states 0x1) afs: Ignoring the dcache for now, but this may indicate corruption in the AFS cache, or a bug. It's probably difficult or impossible to detect if this specific case is happening, so to avoid this scenario, just avoid doing the size check at all for RW data from the cache. 
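The arithmetic behind the false positive above can be worked through with a small sketch. It is a simplified model, not the logic of IsDCacheSizeOK itself (which is not shown in this log): both function names are hypothetical, and the rule "a chunk should hold min(chunk size, file length - chunk offset) bytes" is an assumption inferred from the log message quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes a fully populated chunk would hold (simplified, assumed model). */
static uint64_t
example_expected_chunk_bytes(uint64_t file_len, uint64_t chunk_offset,
                             uint64_t chunk_size)
{
    assert(chunk_size > 0);
    if (chunk_offset >= file_len)
        return 0;
    if (file_len - chunk_offset < chunk_size)
        return file_len - chunk_offset;
    return chunk_size;
}

/* Nonzero if a naive size check would flag the chunk as too small. */
static int
example_naive_check_flags(uint64_t actual_bytes, uint64_t file_len,
                          uint64_t chunk_offset, uint64_t chunk_size)
{
    return actual_bytes <
        example_expected_chunk_bytes(file_len, chunk_offset, chunk_size);
}
```

Using the numbers from the message: after the first write the file is 4 bytes long and chunk 0 holds 4 bytes, so nothing is flagged. After the second write the file is 0x400004 (4194308) bytes long, chunk 0 still holds 4 bytes, and the model expects 131072, so the locally written chunk is flagged even though the data is valid.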
Change-Id: Ia40ec838c525d9abc13a03be39028e4ca04a9457 Reviewed-on: https://gerrit.openafs.org/13969 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 0593017177edd5b3bc6609d9dfcce55f15bba3e9 Author: Marcio Barbosa Date: Thu Nov 14 01:15:47 2019 -0300 viced: prevent writes on readonly fileservers Currently, a fileserver can be initialized as readonly. In this mode, writes on this server should not be allowed. Unfortunately, updates on files stored by readonly fileservers are not completely prevented. In some situations, the check for RO server is omitted (e.g. if the user is the owner of the file to be updated). In other situations, the same check is redundant. To fix these problems, consolidate this check in one place. Change-Id: Id53e15216404dfe691a87c7b4964ff08924c262c Reviewed-on: https://gerrit.openafs.org/13934 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 2ae2a15c9dc9b26eaa15964cc96fdeeb6d82c74c Author: Marcio Barbosa Date: Mon Jun 6 14:03:54 2016 -0300 sys: retry lsetpag if errno is EINTR The variable errno might be set by some system calls to indicate the reason why the system call in question did not work as expected. If the setpag system call is interrupted by a signal, the value of errno will be EINTR. This value means that setpag did not succeed because it was interrupted. If lsetpag did not succeed and errno is equal to EINTR, try again. Change-Id: Ibf306d62fc8d2fa9ccb0692f9031c5aa659b2bfe Reviewed-on: https://gerrit.openafs.org/12295 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 9563807791e2402f7a214a90e96cf6ed8ea5abfb Author: Marcio Barbosa Date: Thu Nov 7 00:10:12 2019 -0300 afs: afs_pag_wait() makes process unkillable To enforce a maximum average rate of one PAG allocation per second, afs_pag_wait(), called by afs_setpag*(), sleeps until the difference between the current time and pag_epoch gets greater than pagCounter. 
Unfortunately, this function ignores the code returned by afs_osi_Wait(). As a result, it is not possible to kill the process that requested the new pag while afs_pag_wait() is sleeping. To fix this problem, do not ignore the code returned by afs_osi_Wait(). Change-Id: I6be11a569edcafa6ecdf716e5315fc75f5a128e8 Reviewed-on: https://gerrit.openafs.org/12260 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 9d0854547522f7b2fb1bb7aa876fe9f901674747 Author: Andrew Deason Date: Sun Nov 17 20:58:15 2019 -0600 afs: Ensure CDirty is set during afs_write loop Currently, in afs_write(), we set CDirty on the given vcache, and then write the given data into various dcaches. When writing to a dcache, we call afs_DoPartialWrite, which may cause us to flush the dirty data to the fileserver and clear the CDirty bit. If we were given more than 1 chunk of data to write, we will then go through another iteration of the loop, writing more dirty data into dcaches, but CDirty will not be set. This can cause issues with, for example, afs_SimpleVStat() or afs_ProcessFS(), which use CDirty to determine whether or not to merge in FetchStatus info from the fileserver into our local cache. This can cause our local cache to incorrectly reflect the state of the file on the fileserver, instead of the state of the locally-modified file in our cache. A more detailed example is as follows. 
Consider a small C program that copies a file, fchmod()ing the destination before closing it: void do_copy(char *src_name, char *dest_name) { /* error checking elided */ src_fd = open(src_name, O_RDONLY); dest_fd = open(dest_name, O_WRONLY|O_CREAT|O_TRUNC, 0755); fstat(src_fd, &st); src_buf = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, src_fd, 0); write(dest_fd, src_buf, st.st_size); munmap(src_buf, st.st_size); close(src_fd); fchmod(dest_fd, 0100644); close(dest_fd); } Currently, on FBSD, using this to copy a 7862648-byte file, using a smallish cache (10000 blocks) will cause the destination to appear to be truncated, because avc->f.m.Length will be incorrect, even though all of the relevant data was written to the fileserver. On most other platforms such as SOLARIS and LINUX, this is not a problem, since currently they only write one page of data at a time to afs_write(), and so they never hit multiple iterations of the while() loop inside afs_write(). To fix this, just set CDirty on every iteration of the while() loop in afs_write(). In general, we need to set CDirty after calling afs_DoPartialStore() anywhere if the caller continues to write more data. But all callers already do this, except for this one instance in afs_write(). Thanks to tcreech@tcreech.com for helping find occurrences of the relevant issue. FIXES 135041 Change-Id: I0f7a324ea2d6987a576786292be2d06487359aa6 Reviewed-on: https://gerrit.openafs.org/13948 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 4a9078c6bbf51720a5eacf7e6ba21443e5103eee Author: Andrew Deason Date: Tue Nov 5 10:50:01 2019 -0600 afs: Avoid giving wrong 'tf' to afs_InitVolSlot Commit 75e3a589 (libafs: afs_InitVolSlot function) split out a bit of our code that initializes a struct volume into the afs_InitVolSlot function. However, it caused us to almost always pass a non-NULL 'tf' to afs_InitVolSlot, even if the target volume was not found. 
That is, before that commit, our code roughly did this: for (...; j != 0; j = tf->next) { ...; tf = &staticVolume; if (tf->volume == volid) break; } if (tf && j != 0) { use_tf_data(); } else { use_blank_data(); } The reason for the extra 'j != 0' check after the loop is to see if we hit the end of the volume hash chain, or if we actually found a matching 'tf' in the loop. And after that commit, the code did this: for (...; j != 0; j = tf->next) { ...; if (j != 0) { tf = &staticVolume; if (tf->volume == volid) break; } } if (tf) { use_tf_data(); } else { use_blank_data(); } The check for 'j != 0' was moved to inside the for loop, but 'j' is always nonzero in the loop (otherwise, the for() would exit the loop). This means that if we didn't find a matching 'tf' in the loop, our 'tf' would be non-NULL anyway, and so we'd initialize our volume slot from just the last entry in the hash chain. This means that for volumes that are not found in the VolumeItems file, our struct volume will probably be initialized with arbitrary data from another volume, instead of being initialized to the normal defaults (the 'else' clause in afs_InitVolSlot). This means that the 'dotdot' entry for the volume may be wrong, and so we may report the wrong parent dir for the root of a volume. However, the 'dotdot' entry should be fixed when the volume root is accessed via a mountpoint, so any such issue should be temporary. And of course, on some platforms (LINUX) we don't ever use the 'dotdot' information for a volume, and even on other platforms, often resolving the '..' entry is handled by other means (e.g. shells often calculate it themselves). But some 'pwd' calculations and other '..' corner cases may be affected. To fix this, change the relevant loop so that we only set 'tf' to non-NULL when we actually find a matching entry. 
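The shape of the fix described above, where the result pointer only becomes non-NULL on an actual match, can be sketched like this. The sketch is an assumption-laden illustration: struct example_item and example_find are hypothetical, and an in-memory array with index 0 as the chain terminator stands in for the on-disk VolumeItems chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry: 'next' is an index into the array, 0 ends the chain. */
struct example_item {
    int volume;
    int next;
};

/*
 * Walk the chain starting at index 'head' in 'items' (slot 0 is unused).
 * Returns the matching entry, or NULL if the chain ends without a match.
 */
static const struct example_item *
example_find(const struct example_item *items, int head, int volid)
{
    const struct example_item *found = NULL;
    int j;

    assert(items != NULL);
    for (j = head; j != 0; j = items[j].next) {
        if (items[j].volume == volid) {
            found = &items[j];  /* only set on an actual match */
            break;
        }
    }
    return found;
}
```

Because 'found' is never assigned from a non-matching entry, a caller that tests it for NULL falls through to its default initialization when the volume is absent, instead of reusing the last entry in the chain.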
Change-Id: I53118960462c0057725e749cbf588e98024217c3 Reviewed-on: https://gerrit.openafs.org/13933 Tested-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 360b9d5d71fb1de142ae4efd4660732476855a3f Author: Andrew Deason Date: Mon Nov 4 20:03:43 2019 -0600 afs: Avoid -1 error for vreadUIO/vwriteUIO Commit c6b61a45 (afs: Verify osi_UFSOpen worked) added various checks to return an error if a given osi_UFSOpen failed. However, two of these checks (in afs_UFSReadUIO and afs_UFSWriteUIO) result in us returning -1 on error, in functions that otherwise return errno codes (e.g. ENOSPC). An error code of -1 might get interpreted as RX_CALL_DEAD, which would be rather confusing, so use EIO as a generic error instead. Change-Id: I23b9a73b82d999d8ee4670b5e7ec39b9d820fb0f Reviewed-on: https://gerrit.openafs.org/13931 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit b3b56d79653566ef1442d296e31beb762d25ce42 Author: Andrew Deason Date: Mon Nov 4 16:10:25 2019 -0600 doc: Fix realm capitalization In this example, krbtgt.Example.COM clearly refers to the principal name converted from krbtgt/Example.COM, and so by convention the realm name would be in all caps. Fix this example to use the all-caps realm name, for consistency. This mistake was introduced by commit 1cc8feb6 (doc: replace hostnames with IETF example hostnames), the realm was in all caps before that commit. Mistake spotted by Chas Williams. 
Change-Id: Icaf4931868752064c4617c8ad778122e076ae3cb Reviewed-on: https://gerrit.openafs.org/13930 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 6ec46ba7773089e1549d27a0d345afeca65c9472 Author: Andrew Deason Date: Mon Sep 16 14:06:53 2019 -0500 OPENAFS-SA-2019-003: ubik: Avoid unlocked ubik_currentTrans deref Currently, SVOTE_Debug/SVOTE_DebugOld examine some ubik internal state without any locks, because the speed of these functions is more important than accuracy. However, one of the pieces of data we examine is ubik_currentTrans, which we dereference to get ubik_currentTrans->type. ubik_currentTrans could be set to NULL while this code is running, so there is a small chance of this code causing a segfault, if SVOTE_Debug() is running when the current transaction ends. We only ever initialize ubik_currentTrans as a write transaction (via SDISK_Begin), so this check is pointless anyway. Accordingly, skip the type check, and always assume that any active transaction is a write transaction. This means we only ever access ubik_currentTrans once, avoiding any risk of the value changing between accesses (and we no longer need to dereference it, anyway). Note that, since ubik_currentTrans is not marked as 'volatile', some C compilers, with certain options, can and do assume that its value will not change between accesses, and thus only fetch the pointer value once. This avoids the risk of NULL dereference (and thus, crash, if pointer stores/loads are atomic), but the value pointed to by ubik_currentTrans->type would be incorrect when the transaction ends during the execution of SVOTE_Debug().
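The "read once, never dereference" pattern described above can be sketched as follows. This is only an illustration under stated assumptions: 'struct trans', 'current_trans' and 'debug_active_write' are hypothetical names, not the ubik ones, and the sketch keeps the lock-free, accuracy-for-speed trade-off from the text (a strictly race-free version in C11 would need an atomic load, which is not shown here).

```c
#include <stddef.h>

/* Hypothetical stand-in for a transaction object; not the ubik type. */
struct trans {
    int type;
};

/* Shared pointer; another thread may set this to NULL at any time. */
struct trans *current_trans;

/*
 * Report 1 if a write transaction appears to be active, else 0.
 * The shared pointer is read exactly once and is never dereferenced,
 * so a concurrent reset to NULL cannot cause a crash here.
 */
int
debug_active_write(void)
{
    struct trans *snapshot = current_trans;   /* the single access */

    /* Any active transaction is assumed to be a write transaction. */
    return snapshot != NULL;
}
```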
Change-Id: Ia36c58e5906f5e8df59936f845ae11e886e8ec38 Reviewed-on: https://gerrit.openafs.org/13915 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 93aee3cf40622993b95bd1af77080a31670c24bb Author: Andrew Deason Date: Wed Aug 7 21:19:47 2019 -0500 OPENAFS-SA-2019-002: Zero all server RPC args Currently, our server-side RPC argument-handling code generated from rxgen initializes complex arguments like so (for example, in _RXAFS_BulkStatus): AFSCBFids FidsArray; AFSBulkStats StatArray; AFSCBs CBArray; AFSVolSync Sync; FidsArray.AFSCBFids_val = 0; FidsArray.AFSCBFids_len = 0; CBArray.AFSCBs_val = 0; CBArray.AFSCBs_len = 0; StatArray.AFSBulkStats_val = 0; StatArray.AFSBulkStats_len = 0; This is done for any input or output arguments, but only for types we need to free afterwards (arrays, usually). We do not do this for simple types, like single flat structs. In the above example, we do this for the arrays FidsArray, StatArray, and CBArray, but 'Sync' is not initialized to anything. If some server RPC handlers never set a value for an output argument, this means we'll send uninitialized stack memory to our peer. Currently this can happen in, for example, MRXSTATS_RetrieveProcessRPCStats if 'rxi_monitor_processStats' is unset (specifically, the 'clock_sec' and 'clock_usec' arguments are never set when rx_enableProcessRPCStats() has not been called). To make sure we cannot send uninitialized data to our peer, change rxgen to instead 'memset(&arg, 0, sizeof(arg));' for every single parameter. Using memset in this way just makes this a little simpler inside rxgen, since all we need to do this is the name of the argument. 
With this commit, the rxgen-generated code for the above example now looks like this: AFSCBFids FidsArray; AFSBulkStats StatArray; AFSCBs CBArray; AFSVolSync Sync; memset(&FidsArray, 0, sizeof(FidsArray)); memset(&CBArray, 0, sizeof(CBArray)); memset(&StatArray, 0, sizeof(StatArray)); memset(&Sync, 0, sizeof(Sync)); Change-Id: Iedccc25e50ee32bd1144e652b951496cb7dde5d2 Reviewed-on: https://gerrit.openafs.org/13914 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit ea276e83e37e5bd27285a3d639f2158639172786 Author: Andrew Deason Date: Wed Aug 7 20:50:47 2019 -0500 OPENAFS-SA-2019-001: Skip server OUT args on error Currently, part of our server-side RPC argument-handling code that's generated from rxgen looks like this (for example): z_result = SRXAFS_BulkStatus(z_call, &FidsArray, &StatArray, &CBArray, &Sync); z_xdrs->x_op = XDR_ENCODE; if ((!xdr_AFSBulkStats(z_xdrs, &StatArray)) || (!xdr_AFSCBs(z_xdrs, &CBArray)) || (!xdr_AFSVolSync(z_xdrs, &Sync))) z_result = RXGEN_SS_MARSHAL; fail: [...] return z_result; When the server routine implementing the RPC returns a non-zero value in z_result, the call will be aborted. However, before we abort the call, we still call the xdr_* routines with XDR_ENCODE for all of our output arguments. If the call has not already been aborted for other reasons, we'll serialize the output argument data into the Rx call. If we push more data than can fit in a single Rx packet for the call, then we'll also send that data to the client. Many server routines for implementing RPCs do not initialize the memory inside their output arguments during certain errors, and so the memory may be leaked to the peer. To avoid this, just jump to the 'fail' label when a nonzero 'z_result' is returned. This means we skip sending the output argument data to the peer, but we still free any argument data that needs freeing, and record the stats for the call (if needed).
This makes the above example now look like this: z_result = SRXAFS_BulkStatus(z_call, &FidsArray, &StatArray, &CBArray, &Sync); if (z_result) goto fail; z_xdrs->x_op = XDR_ENCODE; if ((!xdr_AFSBulkStats(z_xdrs, &StatArray)) || (!xdr_AFSCBs(z_xdrs, &CBArray)) || (!xdr_AFSVolSync(z_xdrs, &Sync))) z_result = RXGEN_SS_MARSHAL; fail: [...] return z_result; Change-Id: I2bdea2e808bb215720492b0ba6ac1a88da61b954 Reviewed-on: https://gerrit.openafs.org/13913 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit a455452d7ee98d160620925bb8a0e3d0f4dfd7ec Author: Cheyenne Wills Date: Tue Oct 1 12:14:41 2019 -0600 LINUX 5.3: Add comments for fallthrough switch cases With commit 6e0f1c3b45102e7644d25cf34395ca980414317f (LINUX: Honor --enable-checking for libafs) building libafs against a linux 5.3 kernel compiles with errors due to fall through in case statements when --enable-checking / --enable-warning is used. e.g. src/opr/jhash.h:82:17: error: this statement may fall through [-Werror=implicit-fallthrough=] case 3 : c+=k[2]; ~^~~~~~ The GCC compiler will disable the implicit-fallthrough check for case statements that contain a "special" comment ( /* fall through */ ). Add the 'fall through' comment to indicate where fall throughs are acceptable. This commit only adds comments and does not alter any executable code. 
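As an illustration of the comment style involved, here is a minimal sketch of a switch with intentional fall through, loosely modeled on the shape of the jhash.h tail handling quoted above. The function 'tail_sum' is hypothetical and is not the real hash code; it only shows where the comments go.

```c
/*
 * Add up the first 'len' bytes of 'k' (at most three), letting each
 * case intentionally fall through into the next one.
 */
unsigned int
tail_sum(const unsigned char *k, int len)
{
    unsigned int c = 0;

    switch (len) {
    case 3:
        c += k[2];
        /* fall through */
    case 2:
        c += k[1];
        /* fall through */
    case 1:
        c += k[0];
        break;
    default:
        /* nothing to add for other lengths */
        break;
    }
    return c;
}
```

Without the two comments, a build with -Werror=implicit-fallthrough would be expected to reject the 'case 3' and 'case 2' arms in the same way as the error shown above.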
The -Wimplicit-fallthrough flag was enabled globally in the linux kernel build in 5.3-rc2 (commit: a035d552a93bb9ef6048733bb9f2a0dc857ff869 Makefile: Globally enable fall-through warning) Change-Id: Ie6ca425e04b53a22d07b415cb8afd172af7e8081 Reviewed-on: https://gerrit.openafs.org/13881 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 747afb94aa214217a749471679082c6ed8e81e92 Author: Marcio Barbosa Date: Thu Sep 20 08:44:59 2018 -0400 afs: avoid extra VL_GetEntryByName for .readonly's In the VLDB, there's only one logical entry for a volume and its associated clones; there are not separate entries for the RW volume "avol", the RO volume "avol.readonly", and the BK volume "avol.backup". And so, when looking up a volume in the VLDB by name, the vlserver ignores any trailing ".readonly" or ".backup" in the given name. More concretely, the result of calling VL_GetEntryByName*("avol") is identical to that from calling VL_GetEntryByName*("avol.readonly"). Accordingly, if afs_GetVolumeByName(name) failed because the volume was not found in the VLDB, afs_GetVolumeByName(name.readonly) will fail as well (barring a change in external circumstances, such as the volume being created or a network connection coming back up). Therefore, the extra call in EvalMountData() is not necessary and can be removed. Remove the extra call, to slightly improve the response time of the client if the volume in question does not exist, and to reduce vlserver load when patched clients are looking up nonexistent volumes. Change-Id: I4f2f668107281565ae72a563a263121bd9bb7e3c Reviewed-on: https://gerrit.openafs.org/13334 Tested-by: BuildBot Reviewed-by: Marcio Brito Barbosa Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 860cbec815d61db2d82870290652a3bc7471b8e3 Author: Michael Meffie Date: Tue Oct 1 16:16:16 2019 -0400 RedHat: package rxstat_* programs Install libadmin rxstat_* sample programs with 'make install'/'make dest'. 
Include these programs in the openafs rpm package. Change-Id: I81b965cf440c869072cce0065a3c74c4c699b8b8 Reviewed-on: https://gerrit.openafs.org/13883 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit b03f3e6101ff21a6f148c555c213c47678482a7b Author: Cheyenne Wills Date: Thu Oct 3 10:21:43 2019 -0600 RedHat: Update makesrpm.pl to use @PACKAGE_VERSION@ instead of @VERSION@ Commit 2f2c2ce62aa17ecac3651d64c1168af926f7458b 'Remove automake autoconf vars' replaced the automake variable @VERSION@ with the autoconf variable @PACKAGE_VERSION@. (Gerrit #13357) The RedHat openafs.spec.in is not processed using autoconf, but by 'makesrpm.pl', which was not updated to use @PACKAGE_VERSION@. Update makesrpm.pl to use @PACKAGE_VERSION@ instead of @VERSION@. Change-Id: I74d1d61e40e660459942ec68cfdedfe569a6abeb Reviewed-on: https://gerrit.openafs.org/13887 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit d9fc4890f01a41fa5a63f97f2446b3afc35b473f Author: Andrew Deason Date: Thu Sep 26 13:35:51 2019 -0500 rx: Fix test for end of call queue for LWP Commit 6ad3d646 (rx: Correctly test for end of call queue) fixed a broken end-of-queue check in rx_GetCall, but it only fixed the RX_ENABLE_LOCKS version of rx_GetCall. The non-locks version (i.e. the LWP version) still had this bug. Fix it for the LWP case, to avoid some rare cases where an Rx call can get stuck in the incoming queue. Also remove the comment added by commit 170dbb3c (rx: Use opr queues), since we're fixing the mentioned problem. Change-Id: I5b96d97d9aba7bc4b383133b2136f949f3ed22bc Reviewed-on: https://gerrit.openafs.org/13880 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit aefc4c4f46e13f59b4cbe043e1a2a6f4ed99e076 Author: Mark Vitale Date: Tue Sep 17 15:14:44 2019 -0400 viced: consistently enforce host thread quota for ICBS(3) From time to time, the fileserver may issue potentially long-running RXAFSCB_* RPCs back to a host (client).
If these are holding h_Lock_r (host->lock) while running, they may cause other service threads for the same host (client) to block. In order to prevent a given host from tying up too many service threads in this way, the fileserver enforces a quota limiting how many threads can be waiting for h_Lock_r on a particular host while waiting for one of the following RPCs to complete: - RXAFSCB_TellMeAboutYourself (TMAY) - RXAFSCB_WhoAreYou - RXAFSCB_ProbeUuid - RXAFSCB_InitCallBackState (ICBS) - RXAFSCB_InitCallBackState3 (ICBS3) Note: Although some of these RPCs are relatively lightweight, they may still experience network delays. This quota is enforced by calling h_threadquota() in h_Lookup_r and h_GetHost_r. The quota check is enabled for a given host by turning on host->hostFlags HWHO_INPROGRESS for the duration of the RXAFSCB_* RPC. The quota check is only needed, and should only be enabled, when the RPC is issued while h_Lock_r is held. However, there are a few paths to ICBS(3) where h_Lock_r is held but HWHO_INPROGRESS is not set. A delay in those paths may allow a host to consume an unlimited number of fileserver threads. One such path observed in a field report was SRXAFS_FetchStatus -> CallPreamble -> BreakDelayedCallBacks_r -> RXAFSCB_ICBS3. To fix this, enable host thread quotas for all remaining unregulated ICBS(3) RPCs. Change-Id: I70b96055ff80d8650bdbaec0302b7d18a8f22d56 Reviewed-on: https://gerrit.openafs.org/13873 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit a133f1b1e7eb605c36ac16a6ed115bef03e8a004 Author: Cheyenne Wills Date: Tue Sep 24 15:59:47 2019 -0600 Retire the AFS_PTR_FMT macro Originally '%x' was commonly used as the printf specifier for formatting pointer values. Commit 37fc3b01445cd6446f09c476ea2db47fea544b7d introduced the AFS_PTR_FMT macro to support platform-dependent printf format specifiers for pointer representation. This macro defined the format specifier as '%p' for Windows, and '%x' for non-Windows platforms.
Commit 2cf12c43c6a5822212f1d4e42dca7c059a1a9000 changed the printf pointer format specifier from '%x' to '%p' on non-Windows platforms as well, so at this point '%p' is the printf pointer format specifier for all supported platforms. Since the AFS_PTR_FMT macro is no longer platform-dependent, and all C89 compilers support the '%p' specifier, retire the macro to simplify the printf format strings. Change-Id: I0cb13cccbe6a8d0000edd162b623ddcdb74c1cf7 Reviewed-on: https://gerrit.openafs.org/13830 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Michael Meffie commit 0f1e54c47c179bdbd69799170d9740e3e58e86db Author: Andrew Deason Date: Fri Aug 16 12:48:21 2019 -0500 Pass -shared when linking some shared libraries Currently, we use $(LT_LDLIB_shlib) to build most of our shared libraries. This invokes libtool, passing our various flags like PTH_LDFLAGS and PTH_CFLAGS (since all of our shared-library code is for pthreads). Notably, we do NOT pass the -shared flag; the -shared flag tells libtool to only build a shared library, and to not also build a static library (on systems where libtool supports building shared and static libraries simultaneously). Because of this, our LT_LDLIB_shlib invocations build both, which is reasonably correct for our per-module convenience libraries (that end up getting linked statically into the binaries that we install), but is not entirely correct for the public libraries that we install. Specifically, for ABI compatibility purposes, we must provide both shared and static libraries of the public libraries that we install, and since libtool on AIX does not build (or install) a static library at all with --mode=link unless -static is passed, we have separate rules to build the shared and static libraries for final installation. This can cause install errors with parallel make (on non-AIX systems), and possibly other errors, when we go to install the relevant library into TOP_LIBDIR.
For example, in src/kopenafs, we have the following rules: ${TOP_LIBDIR}/libkopenafs.${SHLIB_SUFFIX}: libkopenafs.la ${LT_INSTALL_DATA} libkopenafs.la ${TOP_LIBDIR}/libkopenafs.la ${RM} ${TOP_LIBDIR}/libkopenafs.la ${TOP_LIBDIR}/libkopenafs.a: libkopenafs.a ${INSTALL_DATA} libkopenafs.a $@ The rule to install libkopenafs.so will invoke libtool to do the install, which will install libkopenafs.so, libkopenafs.so.X.Y, and libkopenafs.a (from .libs/libkopenafs.a, not the libkopenafs.a we built separately). If we are running the rule to install libkopenafs.a in parallel, it may fail with an error like so: /usr/bin/install -c -m 644 libkopenafs.a /home/buildbot/openafs/fedora26-x86_64/build/lib/libkopenafs.a /usr/bin/install: cannot create regular file '/home/buildbot/openafs/fedora26-x86_64/build/lib/libkopenafs.a': File exists make[3]: *** [Makefile:35: /home/buildbot/openafs/fedora26-x86_64/build/lib/libkopenafs.a] Error 1 Even without that error, this confusion means that the libkopenafs.a installed into TOP_LIBDIR may be the one from src/kopenafs/libkopenafs.a, or the one from libtool's src/kopenafs/.libs/libkopenafs.a; it depends on what order the rules are run. If those libraries are different, that could potentially cause all sorts of other problems. To avoid this, we can pass -shared to libtool when building our shared libraries. We used to pass -shared when building shared libraries, since -shared is almost always one of our SHLIB_LDFLAGS set in src/osconf.m4. However, ever since commit 2c3a517e (Retire Makefile.shared), SHD_CFLAGS, SHD_LDFLAGS, and SHD_CCRULE have all been unused, and SHD_LDFLAGS was the only place where we used SHLIB_LDFLAGS. As a result, we never use SHLIB_LDFLAGS anywhere, and so we never pass -shared to anything. However, we cannot pass -shared to libtool when building all of our shared libraries, since we do need the static library for our per-module convenience libraries.
For example, liboafs_rx.la has no separately-built static library (librx.a is for LWP, liboafs_rx.{so,a} is for pthreads), but liboafs_rx needs to be linked statically into all of our command-line tools. So to fix this, introduce a new linking rule, called LT_LDLIB_shlib_only, which causes the given library to be built only as a shared library (by giving -shared to libtool), and not as a static library. Update the build rules to use this new linking rule for the libraries that need it, and leave the others alone. Since the only use of LT_LDLIB_shlib_missing is also for a public library (afshcrypto), also pass -shared in that rule. Also remove SHD_* and SHLIB_LDFLAGS variables, since they are unused. Change-Id: Ia9e040afa3819f1ff70d050a400fecb9624bb9ba Reviewed-on: https://gerrit.openafs.org/13786 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 1de602aaada15df1008140784092c2a76a2613a1 Author: Yadavendra Yadav Date: Wed Aug 28 17:26:41 2019 +0530 aklog: avoid infinite lifetime tokens by default Currently, the aklog impersonate feature gives us tokens with an infinite lifetime. Based on input from Ben, this was done so that server-to-server tickets would be valid forever. However, on 1.8.x we have other mechanisms that are usable for server-to-server authentication with strong enctypes, so we do not need user-level akimpersonate to generate tokens with an infinite lifetime. To address this, add a new option, -token-lifetime, which takes a value from 0 to 720 hours. A value of 0 means that tokens will have an infinite lifetime. By default, akimpersonate tokens have a lifetime of 10 hours.
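The range rules for the new option can be sketched as below. This is a minimal sketch and not the aklog implementation: the helper 'lifetime_from_hours' and both macro names are hypothetical, and representing an infinite lifetime as LONG_MAX seconds is purely an assumption made for the example. Only the 0 to 720 hour range, the meaning of 0, and the 10 hour default come from the text above.

```c
#include <limits.h>

#define DEFAULT_LIFETIME_HOURS 10    /* default stated in the commit message */
#define MAX_LIFETIME_HOURS     720   /* upper bound stated in the commit message */

/*
 * Hypothetical helper: convert a lifetime in hours to seconds.
 * Returns 0 on success, or -1 if 'hours' is outside 0..720.
 * Zero hours means "infinite", stored here as LONG_MAX (an assumption).
 */
int
lifetime_from_hours(long hours, long *seconds_out)
{
    if (hours < 0 || hours > MAX_LIFETIME_HOURS)
        return -1;

    if (hours == 0)
        *seconds_out = LONG_MAX;
    else
        *seconds_out = hours * 3600L;
    return 0;
}
```

Rejecting out-of-range values up front, instead of clamping them, keeps a mistyped lifetime from silently turning into a much longer or shorter token.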
Change-Id: I8190be81771b34682cc000ac051888561dc63c2f Reviewed-on: https://gerrit.openafs.org/13828 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit dc99144da54d12e8a168c3dfb0255e2a40ba321f Author: Mark Vitale Date: Wed Jul 17 22:07:45 2019 -0400 rx: add missing CLEAR_CALL_QUEUE_LOCK to LWP rx_GetCall In all other places where we remove an rx_call from a queue, we also CLEAR_CALL_QUEUE_LOCK. This isn't necessary in the LWP (non-RX_ENABLE_LOCKS) version of rx_GetCall because rx_call does not have member call_queue_lock for LWP. However, for the sake of consistency for future maintainers, add a CLEAR_CALL_QUEUE_LOCK here as well; it is a no-op for LWP. No functional change is incurred by this commit. Change-Id: Ibbb005fa15dd517fc5282574d0d4abd74e937e02 Reviewed-on: https://gerrit.openafs.org/13695 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit fe6798d0d9e4df006ef96612b5c6e07fcc757b7e Author: Mark Vitale Date: Mon Sep 16 01:37:33 2019 -0400 SOLARIS: add autoconfig support for Studio 12.6 Add the canonical install path for Studio 12.6 to the autoconfig test. Change-Id: Id90ae1816845ed8aaa80be7b3d57846059084339 Reviewed-on: https://gerrit.openafs.org/13867 Tested-by: Mark Vitale Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit e87c40f4546ee9c31b2eaad2a24be9fb9a0b25b1 Author: Mark Vitale Date: Thu Mar 14 23:15:29 2019 -0400 rx: clear call_queue_lock after removing call from queue The call_queue_lock is set to either rx_serverPool_lock or rx_freeCallQueue_lock, depending on whether an rx_call resides in the rx_incomingCallQueue or the rx_freeCallQueue, respectively. This value is used by rxi_ResetCall to lock the appropriate queue before removing a call. Therefore, the call_queue_lock should be cleared after a call is removed from a queue. This issue has no known external symptoms; however, repairing this is helpful to developers examining core files. 
Repair two instances where the call_queue_lock is not cleared. Change-Id: Id1d9ac8454c1e07c10766dffb2a2beac7122bf3e Reviewed-on: https://gerrit.openafs.org/13641 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 3be5880d1d2a0aef6600047ed43d602949cd5f4d Author: Andrew Deason Date: Mon Jul 8 14:49:23 2019 -0500 afs: Avoid panics in afs_InvalidateAllSegments Currently, afs_InvalidateAllSegments panics when afs_GetValidDSlot fails. We panic in these cases because afs_InvalidateAllSegments cannot simply return an error to its callers; we must invalidate all segments for the given vcache, or we risk serving incorrect data to userspace as explained in the comments. Instead of panicking, though, we could simply sleep and retry the operation until it succeeds. Implement this, retrying every 10 seconds, and logging a message every hour that we're stuck (in case we're stuck for a long time). When we retry the operation, do so in a background request, to avoid a somewhat common situation on Linux where we always get I/O errors from the cache when the calling process has a SIGKILL pending. Create a new background op for this, BOP_INVALIDATE_SEGMENTS. With this, the relevant vcache will be effectively unusable for the entire time we're stuck in this situation (avc->lock will be write-locked), but this is at least better than panicking the whole machine. Change-Id: Icdc58a94f0cd5857903836d94e5cf7814ce7e088 Reviewed-on: https://gerrit.openafs.org/13677 Reviewed-by: Benjamin Kaduk Reviewed-by: Michael Meffie Reviewed-by: Cheyenne Wills Tested-by: BuildBot commit 1c4e94da2a8fce9d79006ad6d6673d3d7de117d3 Author: Benjamin Kaduk Date: Fri Aug 9 07:59:44 2019 -0700 The interminable rework of afs_random() Commit f0a3d477d6109697645cfdcc17617b502349d91b restructured the operation on tv_usec to avoid using undefined behavior, but in the process introduced a behavior change.
Historically (at least as far back as AFS-3.3), we masked off the low nybble (four bits) of tv_usec before adding the low byte (eight bits) of the rxi_getaddr() output. Why there was a desire to combine two sources of input for the overlapping four bits remains unclear, but restore the historical behavior for now, as the intent of commit f0a3d477d6109697645cfdcc17617b502349d91b was to not introduce any behavior changes. Change-Id: Icb8bc1edd34ca29c3094b976436177b18bfc8d1d Reviewed-on: https://gerrit.openafs.org/13759 Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 276bd5c7f8a2ec7673d2ad084566203eb2055938 Author: Yadavendra Yadav Date: Wed Aug 28 17:04:31 2019 +0530 aklog: use any enctype in get_credv5 We currently always pass DES as the requested enctype to get_credv5_akimpersonate, but this means we will fail to use our service princ if we're using another enctype (say, AES) with rxkad-k5. To allow this to work with any enctype, just don't pass any requested enctypes, and just use the enctype inside the 'entry' returned to us from krb5_kt_get_entry. Remove all of the logic associated with the now-unused "allowed_enctypes" argument. Also remove the logic handling the case where "service_principal" is NULL (since no callers pass a NULL service_principal), to make it easier to take out the allowed_enctypes related code. Change-Id: Id11514ead26e15a287791c40509a001a1861df97 Reviewed-on: https://gerrit.openafs.org/13827 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 7a13bce2513baf5a3a61db94f3d88232241cea5b Author: Yadavendra Yadav Date: Wed Aug 28 16:43:35 2019 +0530 aklog: retry getting tokens for KRB5_KT_NOTFOUND error If we're creating tokens with -keytab and our AFS service principal is afs@, we'll first try creating tokens with afs/@ and krb5_kt_get_entry will fail with KRB5_KT_NOTFOUND. Since we do not retry for KRB5_KT_NOTFOUND error, we will not get tokens. 
So in order to get tokens for principal afs@, we should retry on a KRB5_KT_NOTFOUND error. Thanks to jpjanosi@us.ibm.com for finding this issue and suggesting a fix. Change-Id: I8af9df9876973badc4631f509eebcda46d667cef Reviewed-on: https://gerrit.openafs.org/13826 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 2a33a80f7026df6b5e47e42319c55d8b7155675a Author: Andrew Deason Date: Sun Jul 21 18:31:53 2019 -0500 rx: Introduce rxi_NetSend Introduce a small wrapper around osi_NetSend, called rxi_NetSend. This small wrapper allows future commits to change the code around our osi_NetSend calls, without needing to change every single call site, or every implementation of osi_NetSend. Change most call sites to use rxi_NetSend, instead of osi_NetSend. Do not change a few callers in the platform-specific kernel shutdown sequence, since those call osi_NetSend for platform-specific reasons. This commit on its own does not change any behavior with osi_NetSend; it is just code reorganization. Change-Id: I0a7eb39d85d4e542c2832bb40191ab49fb02d067 Reviewed-on: https://gerrit.openafs.org/13717 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 6559297610de0f71c9050f3582d4d146e0cc1f3c Author: Yadavendra Yadav Date: Wed Aug 28 16:25:49 2019 +0530 aklog: Use HAVE_ENCODE_KRB5_ENC_TKT_PART for aklog impersonate In get_credv5_akimpersonate we use HAVE_ENCODE_KRB5_ENC_TKT, which is never defined; as a result, we always return -1 from this routine in the non-Heimdal case. There is another define, HAVE_ENCODE_KRB5_ENC_TKT_PART, which is defined if the encode_krb5_enc_tkt_part function is present. In the current code, encode_krb5_enc_tkt_part is called from krb5_encrypt_tkt_part, and krb5_encrypt_tkt_part is called from get_credv5_akimpersonate in the non-Heimdal case. So we should change HAVE_ENCODE_KRB5_ENC_TKT to HAVE_ENCODE_KRB5_ENC_TKT_PART.
Also while we're here, add a declaration for the internal function encode_krb5_ticket, so we can build this newly-enabled code without warnings. Change-Id: I8f740e319ad279e284efaa407e6f92d0dc7a1bf6 Reviewed-on: https://gerrit.openafs.org/13825 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Andrew Deason commit d1e90b82ebb2685cbac3ecb3fd99136328b35357 Author: Stephan Wiesand Date: Fri Sep 6 13:35:02 2019 +0200 ptserver: Increase length limit of namelist, idlist, prlist, prentries An implementation limit of those lists was introduced in commit a0ffea098d8c5c5b46c6bf86a12d28d6e7096685 to prevent using unlimited amounts of memory in ptserver and the client. Subsequent reports indicate that the chosen limits are small enough to restrict functionality currently in use at some sites where membership lists exceed the current limit. Since this is just an implementation- defined limit and can freely change from release to release, increase the threshold by an order of magnitude to preserve functionality for existing deployments while still retaining some protection against attacker-controlled excessive memory allocation. Change-Id: I857bb3b697909668eb71224b631dfbb7e3c03d3c Reviewed-on: https://gerrit.openafs.org/13838 Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 54150f381de34d2a0c85ab15cf25801effd0c154 Author: Andrew Deason Date: Fri Aug 9 22:36:17 2019 -0500 LINUX: Check for -Wno-error=frame-larger-than= Commit cc7f942a (LINUX: Disable kernel fortuna large frame errors) added -Wno-error=frame-larger-than= to the CFLAGS for a file, but older gcc (like 4.3.4 from SLES 11.x) does not support this flag, causing a compiler error. To avoid this, add a configure check for -Wno-error=frame-larger-than=, and only use it if the compiler supports it. Thanks to mvitale@sinenomine.net for discovering the error. 
Change-Id: I5486d2d4711f2c301be1cb79f0aaad69a22e9d3a Reviewed-on: https://gerrit.openafs.org/13762 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit ddf7d2a7f4bfdcab238e791cb8c49bb803e76b09 Author: Cheyenne Wills Date: Fri Aug 9 13:25:26 2019 -0600 vlserver: initialize nvlentry elements after read Commit 7620bd33487207b348ed7aeba45f8d743132ba84 (vlserver: fix vlentryread() for old vldb formats) leaves the tail end of the serverNumber, serverPartition and serverFlags arrays uninitialized since it only copies OMAXNSERVERS elements into arrays that have NMAXNSERVERS elements. Initialize the elements in the nvlentry server arrays that were not copied with BADSERVERID. Change-Id: I9533e3a40922c76d4179e0ada393103c2aa533dd Reviewed-on: https://gerrit.openafs.org/13755 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 83d9a86fb1af519a92ffc0d8f6d73cddded8f6f5 Author: Andrew Deason Date: Mon Aug 26 22:03:23 2019 -0500 opr: Include procmgmt_softsig.h for WINNT On WINNT, procmgmt_softsig.h exists to implement our opr softsig routines in terms of procmgmt routines. Any time we include opr/softsig.h in cross-platform code, we currently must also include afs/procmgmt_softsig.h so we can build on WINNT. We currently do not do this in src/xstat, causing build failures on WINNT. To avoid this, just make opr/softsig.h include procmgmt_softsig.h itself, so all of the opr/softsig.h users don't have to remember to do this. Link xstat_*_test against procmgmt, so linking will succeed for those tools.
Change-Id: I2dc8226d438be25cdccbe96474220d7c81ae25b9 Reviewed-on: https://gerrit.openafs.org/13824 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit ab8b28540ef17d67db02d5dbcb7585443c164e45 Author: Yadavendra Yadav Date: Sat Aug 10 02:54:38 2019 +0530 aklog: Free client/server princs in get_credv5 Inside get_credv5, client_principal is static so the first time get_credv5 runs we'll allocate memory for it, and on subsequent calls we'll reuse the same value. However, if we call get_credv5_akimpersonate, we'll free client_principal and never change what client_principal points to. If we need to call get_credv5 again (because we need to retry getting creds), we'll reuse the old value for client_principal, but since it points to free memory we'll segfault or cause other problems. To avoid this, change get_credv5 so we allocate the client and server principals on each invocation of get_credv5 and free them before returning from get_credv5. Since we free the client and server principals inside get_credv5, remove freeing the client and server principals inside get_credv5_akimpersonate. Change-Id: Ie263aa2c03efc75e818d9007347dca9e42380dd4 Reviewed-on: https://gerrit.openafs.org/13761 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 2336164d1bf63980419d3a870f908f1f384fdfc0 Author: Andrew Deason Date: Sun Jul 21 17:02:34 2019 -0500 afs: Actually free resources during warm shutdown Currently, the shutdown_*() code paths for several subsystems only free the memory for that subsystem for "cold" shutdowns, and not for "warm" shutdowns. This means the memory gets leaked during a "warm" shutdown, since we never free these resources anywhere else. Specifically, this happens in shutdown_bufferpackage, shutdown_AFS, and shutdown_osinet. To avoid these leaks for warm shutdowns, just move the afs_cold_shutdown check around a little, so we free the relevant items in either codepath. 
Change-Id: I748311784f512b3e2f25bdcaa6629108a5790212 Reviewed-on: https://gerrit.openafs.org/13716 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 130a92214cc0b9a8f4ea24a3dcd3ed04575e3c4e Author: Yadavendra Yadav Date: Sat Aug 10 02:41:01 2019 +0530 aklog: free krb5_creds before returning from rxkad_get_token rxkad_get_ticket allocates 'v5cred' which should be freed when we return from rxkad_get_token. Change-Id: I09b20781f0856ab8e230e0af271e9d0c58fee90c Reviewed-on: https://gerrit.openafs.org/13760 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit fbdf126df02eacc0442d80cc5bca0e16ddafe55e Author: Andrew Deason Date: Sun Aug 25 19:30:30 2019 -0500 rx: Convert rx_FreeSQEList to rx_freeServerQueue Currently, rx_serverQueueEntry structs are placed on the rx_FreeSQEList linked list instead of being freed directly, but managing this list is done a bit oddly. The first field in struct rx_FreeSQEList is an opr_queue, but we don't use the opr_queue_* macros to manage the list. Instead, we just assume the first field in a struct rx_serverQueueEntry is a pointer that we can use to link entries together. This is currently true and works, but it's an odd way of maintaining such a list, and of course would break if we ever moved the fields around in struct rx_serverQueueEntry. Make this code more closely follow the normal way of managing opr_queue lists, by using opr_queue_* macros, and changing rx_FreeSQEList to be an opr_queue itself. Change the name to rx_freeServerQueue to ensure all callers are changed, and to match the naming convention for the other linked lists for rx_serverQueueEntry structs. Also move rx_freeServerQueue and its associated lock freeSQEList_lock to be declared static inside rx.c, since neither is referenced outside of rx.c. The general idea for this commit was suggested by kaduk@mit.edu.
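The idea behind the rx_freeServerQueue conversion, a free list linked through an embedded queue member instead of through whatever happens to be the first field, can be sketched as below. This is a hedged illustration only: q_link, q_prepend, q_remove, q_entry, sq_entry, sq_get and sq_release are hypothetical names made up for the example, and none of this is the actual opr_queue API or the rx.c code (locking is omitted entirely).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal intrusive list; not the opr_queue implementation. */
struct q_link {
    struct q_link *prev, *next;
};

int
q_is_empty(const struct q_link *head)
{
    return head->next == head;
}

void
q_prepend(struct q_link *head, struct q_link *e)
{
    e->next = head->next;
    e->prev = head;
    head->next->prev = e;
    head->next = e;
}

void
q_remove(struct q_link *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->prev = e->next = NULL;
}

/* Recover the containing struct from a pointer to its embedded link. */
#define q_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The link is deliberately not the first field; the list code does not care. */
struct sq_entry {
    int id;
    struct q_link entry;
};

/* Free list head, initialized to an empty circular list. */
struct q_link free_queue = { &free_queue, &free_queue };

/* Put an entry on the free list instead of freeing it. */
void
sq_release(struct sq_entry *sq)
{
    assert(sq != NULL);
    q_prepend(&free_queue, &sq->entry);
}

/* Reuse an entry from the free list if possible, else allocate a new one. */
struct sq_entry *
sq_get(void)
{
    struct sq_entry *sq;

    if (!q_is_empty(&free_queue)) {
        sq = q_entry(free_queue.next, struct sq_entry, entry);
        q_remove(&sq->entry);
        return sq;
    }
    return calloc(1, sizeof(*sq));
}
```

Because the list only ever touches the embedded link member, reordering the other fields of the entry struct cannot silently break the free list.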
Change-Id: I2ea15af1ad3228fa5fdf9f323e9394838fba4bac Reviewed-on: https://gerrit.openafs.org/13811 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 3bc03e7a5f8ef521e71a30cb8e66e07e2d1b4605 Author: Andrew Deason Date: Sun Jun 23 17:48:53 2019 -0500 libafs: Create debug KMODDIR for FBSD debug inst Commit 99418024 (libafs: Create $(DESTDIR)$(KMODDIR) on FBSD inst) made it so we create the kmod installation dir before copying our module into it. However, if we build a 'debug' variant of our module, the FreeBSD build process also installs debug symbols in a different directory, ${DESTDIR}${KERN_DEBUGDIR}${KMODDIR}, which may not exist. So do the same thing for that dir too, if --enable-debug-kernel is turned on, so the build still works. To do this, introduce the LIBAFS_REQ_DIRS var, to make it easier to keep track of which dirs we may need to create. Change-Id: Id1ad72f6c19d5949d38ee97334b4014ae6ef16ad Reviewed-on: https://gerrit.openafs.org/13690 Reviewed-by: Benjamin Kaduk Tested-by: Andrew Deason commit f9e413eaa280377b7dca0214fe79668459035098 Author: Andrew Deason Date: Mon Aug 26 21:17:30 2019 -0500 xstat: Define AFS_PTHREAD_ENV on WINNT Commit 6b67cac4 (convert xstat and friends to pthreads) converted the xstat utilities to pthreads, but we still need to explicitly pass AFS_PTHREAD_ENV on WINNT to enable various pthread-specific code paths. So give -DAFS_PTHREAD_ENV for our objects in this dir. Change-Id: I222b99399a5fad3df528be2bc31823eb8bc52c62 Reviewed-on: https://gerrit.openafs.org/13823 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 7a76f4dc00984d42b0535a8edbedee034ada896f Author: Andrew Deason Date: Mon Aug 26 20:33:58 2019 -0500 WINNT: Link tbutc against mtafsutil.lib tbutc uses pthreads, not LWP, so link it against mtafsutil.lib (a pthread library), and not afsutil.lib (an LWP library). 
Change-Id: Id29888d88bfdd9585e017217a9951eb645c65336 Reviewed-on: https://gerrit.openafs.org/13822 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit c3716b3d7e32f47b084657e163b029e9f1756fa4 Author: Andrew Deason Date: Mon Aug 26 19:34:19 2019 -0500 rx: Export rx_GetCallStatus Commit 59d3a8b8 (vos: restore status information to 'vos status') added the function rx_GetCallStatus to Rx, and used it in the volserver, but didn't add the function to our .sym and .exp files, causing a linker error on at least WINNT. Add the function to the relevant .sym/.exp files, so we can link on all platforms. Change-Id: I859ac6d04d8a21eb6f8b4ba3f3720ca318e91334 Reviewed-on: https://gerrit.openafs.org/13820 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 90117793ca3000a20cb3bff8601e9f8ae56fb5db Author: Andrew Deason Date: Mon Aug 26 18:46:21 2019 -0500 WINNT: Do not link ptclient.obj in libafsauthent ptclient.c contains a stub definition for osi_audit, but audit.c already contains a real definition for osi_audit. libafsauthent doesn't seem to actually need anything from ptclient (and the Unix libafsauthent doesn't appear to use it), so just don't include ptclient when linking libafsauthent. Change-Id: I4172b80138e5ea121fc3ae2689cf4ed23c81e35b Reviewed-on: https://gerrit.openafs.org/13819 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit e4b689e8c7cb39b72854dd38b6a92134591c8bca Author: Andrew Deason Date: Mon Aug 26 18:14:48 2019 -0500 WINNT: Link butc against audit Since commit c43169fd (OPENAFS-SA-2018-001 Add auditing to butc server RPC implementations), butc references symbols from audit. So add audit to our libraries to link against, so we can link butc on WINNT. 
Change-Id: I65f4d87085a8917c9b11d7c27b8e3902cd2a1c1c Reviewed-on: https://gerrit.openafs.org/13818 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit f895a9b51671ffdc920fd9b4284337c5b737a0ef Author: Andrew Deason Date: Mon Aug 26 17:40:56 2019 -0500 WINNT: Make opr_threadname_set a no-op We don't supply an implementation for opr_threadname_set for WINNT; don't pretend that we do. Change-Id: Ifa8042253d0aa10f365356d93cea3fad4686371a Reviewed-on: https://gerrit.openafs.org/13817 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 75a5c1b06e44bb6207cee7bd653cda688869aade Author: Andrew Deason Date: Mon Aug 26 16:54:55 2019 -0500 rxkad: Improve ticket5 import from Heimdal The current method of importing our ticket5 code from Heimdal has a few issues: - The der-protos.h file we generate contains numerous function prototype declarations that look like this: ret-type func(parm-list, type */* comment */); which cause numerous warnings on WINNT, because the '*/*' sequence looks like the end of a nonexistent comment. This was previously fixed manually in commit 8b5d3a73 (rxkad: remove warnings from der-protos.h), but each time we regenerated our ticket5 code, the same thing would happen. - We manually insert an include for "asn1_err.h" in our v5der.c, and the v5gen.c we pull in has an include for <asn1_err.h> inside it. During a WINNT build, these can pull in different asn1_err.h files (one from us, and one from the "Heimdal compatibility layer SDK" or anything else in our include paths). Since the asn1_err.h in our tree doesn't have an include guard, the code for both gets included, which can cause various problems. - Our current asn1_err.h file that we include is ultimately generated by the awk-based compile_et from e2fsprogs, not the C-based compile_et from Heimdal. This likely happened by accident because the Heimdal build system uses the system compile_et by default.
This flavor of compile_et generates arguably inferior comerr-based header files (they lack include guards, and they use #define constants instead of enums). Fix these issues with some edits to our README.v5 script: - Apply a simple sed filter when we pull in der-protos.h to change '*/*' into '* /*', to remove the relevant warnings. - Instead of inserting an include for asn1_err.h into v5der.c in our import script, just put it in ticket5.c, making it easier to see and edit. Change this to <asn1_err.h> so it uses the same asn1_err.h as in v5gen.c. - Add a note to run the Heimdal build with COMPILE_ET=no, so the Heimdal build system uses the in-tree compile_et, instead of whatever is on the relevant system. With these changes, redo the Heimdal import from the same version of the Heimdal codebase. Change-Id: I01e06f2799f1c828b8224c3425079b313ffb5b6b Reviewed-on: https://gerrit.openafs.org/13816 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit b9b5385e6a04dcacd180f33e39495c7909fe4df3 Author: Andrew Deason Date: Mon Aug 26 16:08:31 2019 -0500 kauth: Move COUNT_REQ to beginning of block Commit b604ee7a (OPENAFS-SA-2018-002 kaserver: prevent KAM_ListEntry information leak) added a memset in kamListEntry before COUNT_REQ, but COUNT_REQ declares a local variable. This breaks the WINNT build, because we must declare variables at the beginning of a block. To fix this, just swap the two lines. Change-Id: I47eb61e6f95c2e38c619e90c8f093de325892c63 Reviewed-on: https://gerrit.openafs.org/13815 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 1534302d4489d2ba1c421077cdedb0187a2c1722 Author: Andrew Deason Date: Mon Aug 26 14:34:45 2019 -0500 rxgk: Add NTMakefile to install headers Commit 83eec909 (Implement afsconf_GetRXGKKey) added a reference to rx/rxgk_types.h inside cellconfig.p.h. Nothing ever added src/rxgk WINNT makefiles, so that include file is never installed into place, breaking the WINNT build when code tries to include cellconfig.h.
To fix this and other code that needs rxgk header files, create an NTMakefile for src/rxgk, which just exists to install headers into place. Call it from the top-level NTMakefile right before copying in the auth headers. Change-Id: Id111479f55b4c330640e80d167a8af664fe3622e Reviewed-on: https://gerrit.openafs.org/13814 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 2df2de06e5df64f5666316b14d67de7e7c5dae70 Author: Andrew Deason Date: Sun Jul 21 21:15:11 2019 -0500 rx: Avoid leaking 'sq' in libafs rx_GetCall Currently, in rx_GetCall when building for the kernel, if we notice that we're shutting down (that is, if afs_termState has reached AFSOP_STOP_RXCALLBACK), we return immediately. However, 'sq' may have been allocated much earlier in this function, and if we return here, we never free 'sq' or set it on any list. Returning immediately is also unnecessary here; if we just 'break' out of our wait loop, 'call' will still be NULL, and we'll break out of the outer loop, and go through the rest of the function like normal. The only difference is, if we 'break' instead of 'return'ing, we'll put 'sq' on the free list before returning. So, just 'break' out of the loop instead of returning, so we put 'sq' on the free list and avoid leaking its memory. Change-Id: Ibb2f4e697a586392f76ccdbbefdae8d75740f6fe Reviewed-on: https://gerrit.openafs.org/13715 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 9eeb3ec09f5421ceab2be415a193bb3a3c44925f Author: Andrew Deason Date: Mon Aug 26 13:13:28 2019 -0500 WINNT: Build bubasics before audit Commit 9ebff4c6 (OPENAFS-SA-2018-001 audit: support butc types) made src/audit require the butc.h header, and updated Makefile.in to reflect this. However, this dir is also built on WINNT, and the NTMakefile was not updated to reflect this dependency. As a result, we might fail to build src/audit on WINNT, since butc.h may not exist yet, and we get an error like: cl [...] 
/c audit.c audit.c cl : Command line warning D9025 : overriding '/W4' with '/W3' audit.c(27) : fatal error C1083: Cannot open include file: 'afs/butc.h': No such file or directory NMAKE : fatal error U1077: 'C:\PROGRA~2\MICROS~1.0\VC\bin\amd64\cl.EXE' : return code '0x2' To fix this, move 'bubasics' to be made before 'audit' in NTMakefile, so butc.h is available when we build 'audit'. Change-Id: I2053db7cd95353cf6b703b4033239810338890aa Reviewed-on: https://gerrit.openafs.org/13813 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 8bb9ae944ec7e101b6c8133fdb867c847164b5a7 Author: Andrew Deason Date: Wed Aug 21 12:04:45 2019 -0500 afs: Introduce afs_FreeFirstToken Change afs_FreeOneToken to unlink the given token from its container, instead of requiring its caller to do so. Rename the function to afs_FreeFirstToken, to help indicate the change in behavior. Also, while we are changing afs_FreeTokens to accommodate this change, simplify afs_FreeTokens a little, making it resemble afs_DiscardExpiredTokens a bit more. [kaduk@mit.edu: add note about dead store elimination] Change-Id: I0cf9d8b94236c736001a38cccfa7fdfff9f3e609 Reviewed-on: https://gerrit.openafs.org/13807 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 0a39efee224e8d4431ae79281ca353a7ba6fdce4 Author: Andrew Deason Date: Sun Jul 14 17:31:30 2019 -0500 FBSD: Use ucontext for FreeBSD 10+ on amd64 Currently, running any LWP program on recent FreeBSD on amd64 causes (or can cause) a SIGBUS very quickly. This is possibly because our stack management code in LWP only ensures our stacks are 4 or 8-byte aligned in most cases (except DARWIN, which gets 16-byte-aligned stacks), according to the value of STACK_ALIGN. The amd64 ABI mandates that stacks be 16-byte-aligned, and some function calls assume that this is followed, causing a SIGBUS when it is not. 
FreeBSD on amd64 currently uses process.amd64.s for its savecontext() implementation, which does not do any checking or fixup of the stack alignment. This behavior has been observed on amd64 with FreeBSD 11 specifically, but it probably happens on any FreeBSD release when using clang. FreeBSD switched to clang as the default compiler with FreeBSD 10, so this probably occurs with FreeBSD 10 and newer. We could perhaps try to fix this by changing our stack management code, but we can also avoid most of this nonsense by just using ucontext instead of our custom assembly code. So, do that, by setting USE_UCONTEXT for FreeBSD 10+. Also enable the same 'stackvar'-based workaround in savecontext() as Linux uses, since otherwise 'topstack' appears to always be NULL, and triggers our stack overflow checks. Note that while LWP use is deprecated, as of this commit many small utilities (like 'fs') are still linked to LWP, and so are unusable without a fix like this. Change-Id: Ie8e928bd71e7f6e9c0fb1379259c55527b6ccdf3 Reviewed-on: https://gerrit.openafs.org/13691 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 8f9c92a888df7b2fd61a3e84aaf1d2c96a8b10dd Author: Andrew Deason Date: Sun Jul 28 15:03:43 2019 -0500 FBSD: Set KERNBUILDDIR for --with-bsd-kernel-build Currently, specifying --with-bsd-kernel-build during configure causes us to set BSD_KERNEL_BUILD, which sets KBLD in MakefileProto.FBSD.in, but nothing ever uses KBLD. This means that when we use --with-bsd-kernel-build, we don't actually build against the configuration for that kernel, which can result in a libafs.ko that cannot be loaded or causes other errors. Specifically, if trying to build for a VIMAGE kernel, the kernel complains when trying to load libafs: [...] kernel: link_elf_obj: symbol in_ifaddrhead undefined [...] 
kernel: linker_load_file: Unsupported file type The FreeBSD module build system looks for KERNBUILDDIR for an alternative build, which it uses to pull in opt_global.h and other required pieces from the build tree. So just specify KERNBUILDDIR if we have one. At the same time, avoid setting our default value for BSD_KERNEL_BUILD for FBSD when the calculated dir doesn't exist. At least for the default GENERIC kernel on FreeBSD 11.2-RELEASE, there may not be a build dir on the running machine, and so setting BSD_KERNEL_BUILD to the calculated value causes the build to fail when it doesn't exist. Change-Id: Ib3079354f9f6dba13970de5308bbcecaf9b35059 Reviewed-on: https://gerrit.openafs.org/13746 Tested-by: BuildBot Reviewed-by: Tim Creech Reviewed-by: Benjamin Kaduk commit 1effc3517fdb4b4653d47c59bf67076567209324 Author: Tim Creech Date: Sun Mar 5 18:18:01 2017 -0500 FBSD: Call CURVNET_SET/CURVNET_RESTORE for VIMAGE In commit 9703b023 (FBSD: VIMAGE support), we changed a couple of our variable references to their V_* equivalents, to accommodate kernels with VIMAGE turned on. This allows us to build, but causes us to crash whenever we hit that code when VIMAGE is enabled, because the relevant macros reference 'curvnet', which is NULL outside of networking code. What we're supposed to do is to set 'curvnet' before entering networking code by calling 'CURVNET_SET(xxx)', and reset it afterwards by calling 'CURVNET_RESTORE()'. We must make exactly one _RESTORE call for each _SET, and they are supposed to be run at the same level of scope. So to avoid the crashes, make the relevant CURVNET_* calls whenever we look at networking info. 
We currently only do this in a few places: - In afs_SetServerPrefs, to try to detect if a given server address is in the same network as one our local interfaces (V_in_ifaddrhead) - In rxi_GetIFInfo, for some MTU-related info (V_ifnet) - In rxi_FindIfnet, for some MTU-related info (ifa_ifwithnet) As for what vnet we actually set 'curvnet' to, we could set it to the vnet of the current thread (TD_TO_VNET(curthread)), or we could set it to the vnet of an associated network object (a socket, an interface, etc). Since all of our network-related code goes through Rx, in this commit we set curvnet to the vnet of the Rx socket (rx_socket->so_vnet). Note that VIMAGE is optional in 11-RELEASE, but is turned on by default in 12.0-RELEASE. For more information, see: https://wiki.freebsd.org/VIMAGE/porting-to-vimage [adeason@dson.org: Reworded commit message; moved some code around.] Change-Id: If631b8942d7ee5cfe38a8f0c32b282d015f0bf35 Reviewed-on: https://gerrit.openafs.org/12580 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 1d2a1002bd1bc8d82c05399c06836ede83f9eeea Author: Andrew Deason Date: Wed Aug 21 11:48:53 2019 -0500 afs: Update style in afs_tokens.c Fix a few style nits and other minor edits in afs_tokens.c. Mark a few functions 'static' that are not referenced outside of that file. 
Change-Id: Icdae1adb8282f96c7ccc6d4d053216b360adc38e Reviewed-on: https://gerrit.openafs.org/13806 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit c6eb9375ffa081329d69b9a36b40b8edb199990a Author: Andrew Deason Date: Wed Aug 21 12:37:06 2019 -0500 rx: Update style in rx_opaque.c Fix a few style nits in rx_opaque.c Change-Id: Ia03ba3f95911b791c63b3a07f2ab887063da36a7 Reviewed-on: https://gerrit.openafs.org/13805 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 339167ef1fda899655969f4572ff95271dfdb7cf Author: Andrew Deason Date: Wed Jul 10 15:14:28 2019 -0500 Remove dead code There is a perhaps-surprisingly large amount of code disabled behind directives like '#if 0', '#ifdef notdef', and '#ifdef notyet'. At best, this code is clutter, and at worst some of it is confusing/outdated, and/or confusingly nested inside other preprocessor conditionals. Sometimes this disabled code shows up when grepping the tree, and causes a nuisance when refactoring related areas of code. Get rid of all of it. If anyone ever wants this code back, it can always be restored by reverting portions of this commit. Also delete some comments that clearly refer to the disabled code, and in some cases, adjust the adjacent comments to make sense accordingly. This commit doesn't touch any files in src/external/. 
Change-Id: If260a41257e8d107930bd3c177eddb8ab336f0d1 Reviewed-on: https://gerrit.openafs.org/13683 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 0d6a43e0699fca00bff87c5e16c901e4579d2285 Author: Benjamin Kaduk Date: Sat Apr 12 17:24:04 2014 -0400 Remove a couple more uses of libafsauthent.a Change-Id: Ic49d2f44293c1fbe909b61d7f4c9ac7d5a3636bb Reviewed-on: https://gerrit.openafs.org/11095 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 1b0bb8a7fcbd69d513ed30bb76fd0693d1bd3319 Author: Andrew Deason Date: Thu Jul 18 22:56:48 2019 -0500 LINUX: Make sysctl definitions more concise Our sysctl definitions are quite verbose, and adding new ones involves copying a bunch of lines. Make these a little easier to specify, by defining some new preprocessor macros. Change-Id: I45fc8122b18587f42f52b3d41a1f4c6937ec0f8a Reviewed-on: https://gerrit.openafs.org/13700 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 6e0f1c3b45102e7644d25cf34395ca980414317f Author: Andrew Deason Date: Wed Jul 10 12:42:54 2019 -0500 LINUX: Honor --enable-checking for libafs When we build the kernel module on LINUX, we don't pass in any of our CFLAGS, since the Linux buildsystem itself figures out what flags are needed. However, this means that we don't pass in -Werror when --enable-checking is turned on, so warnings may not cause the build to fail. To fix this, create a new autoconf variable, called CFLAGS_WERROR, that only contains -Werror if --enable-checking is turned on. We then pass that into the Linux module buildsystem, so -Werror is given to the compiler when building our module. 
Change-Id: I0f1ec8b1a8096d10642c67b86314604c20ea2c60 Reviewed-on: https://gerrit.openafs.org/13682 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 13acb6fbefd6c4f4af951270ca07a1a5541052fa Author: Andrew Deason Date: Sun Jul 21 19:21:44 2019 -0500 afs: Free afs_thiscell during shutdown Currently, afs_thiscell can be allocated (via strdup) during client startup, but is never freed. Free it in shutdown_cell() to avoid leaking the memory. Change-Id: I77954ef35f949c8a638ba15615148ab784f7f48f Reviewed-on: https://gerrit.openafs.org/13714 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 82118acb6ed4f6fb90f3d864f4045d9c6bc2a55c Author: Andrew Deason Date: Sun Jul 21 17:58:48 2019 -0500 afs: Introduce shutdown_dynroot() Add a shutdown sequence for dynroot, which frees the afs_dynrootDir and afs_dynrootMountDir blobs, if they exist. Otherwise, we can leak the memory allocated for those blobs. Change-Id: I80fe41a0fcacbd272677ff778cd4ba51399f32f9 Reviewed-on: https://gerrit.openafs.org/13713 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit ad1fe5e1a825a3b3f88c04fd84613e4105206443 Author: Andrew Deason Date: Sun Jul 14 22:53:39 2019 -0500 FBSD: Remove unnecessary explicit osi_fbsd_alloc AFS_KALLOC is already defined to be osi_fbsd_alloc on FBSD, so this extra #ifdef here is completely unnecessary. Remove it. Do the same for AFS_KFREE/osi_fbsd_free. Change-Id: I3e42ec433a732402cc9de9ba9c035774ec29c2a5 Reviewed-on: https://gerrit.openafs.org/13708 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit d13b647aa392e1d802be1023930a8e1a07fb11ab Author: Andrew Deason Date: Sat Jul 20 23:09:27 2019 -0500 FBSD: Give 0 'rootrefs' to vflush on unmount Currently, in afs_unmount, we give vflush a 'rootrefs' arg of 1, indicating that we hold 1 reference on the root vnode. But ever since commit 6eb1088a (freebsd: properly track vcache references), we drop the ref for the root vnode at the beginning of this function. 
What happens currently in afs_unmount for a normal successful umount is something like this (at least, on FreeBSD 11.2-RELEASE): - We afs_PutVCache the afs_globalVp vcache, reducing its v_usecount and v_holdcnt to 0, and afs_globalVp is set to NULL. - vflush calls afs_root() to get the root vnode, which sees that afs_globalVp is NULL, and so calls afs_GetVCache for the root fid and returns it (and sets afs_globalVp to that vcache), with a v_usecount of 1. - vflush tries to vgonel() all of our vnodes, which calls our afs_vop_reclaim, which calls afs_FlushVCache(). For the root vnode specifically, vflush() sees that v_usecount is nonzero, and so skips calling vgonel() at first, but later calls vgone() on it specifically because we gave a nonzero 'rootrefs'. The resulting afs_FlushVCache() for the root vnode fails, because the root vnode's v_usecount is still 1. Since a failure from afs_vop_reclaim would cause a panic, we just log a warning and try to continue on anyway. - vflush() calls vrele() on the root vnode, right before returning. All of this allows the unmount to proceed, but this means that most of afs_FlushVCache() doesn't actually run for the root vcache, and it means we always log a warning like this on unmount: afs_vop_reclaim: afs_FlushVCache failed code 16 [...] In addition, this means that setting afs_globalVp at the beginning of afs_unmount() is largely pointless, since it gets set to a vcache again near the beginning of vflush(). To avoid all of this, stop lying to vflush about how many references to the root vnode we hold, and just say that we hold 0 references. 
Change-Id: Ib434c5fc48e67c3863fcad41279c3d9e0e0b8c2b Reviewed-on: https://gerrit.openafs.org/13709 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit f5acf1b1bfe940faf0a6f4bd11c55d6c90f60242 Author: Tim Creech Date: Sun Mar 5 18:17:23 2017 -0500 FBSD: Handle F_UNLCK in VOP_ADVLOCK When a_fl->type is F_UNLCK, FreeBSD gives our VOP_ADVLOCK an a_op of F_UNLCK, instead of F_SETLK like we expect. This causes afs_lockctl to return EINVAL, since F_UNLCK isn't a normal fcntl lock op, and so userspace requests to unlock fcntl-style locks always fail. This can be seen, for example, when trying to use sqlite3 to access a database that lives in afs. This F_UNLCK behavior in FreeBSD seems a bit peculiar, but has been around effectively forever (since 4.4BSD-Lite). So just work around it. [adeason@dson.org: minor style adjustments and commit message/comment rewording.] Change-Id: I8bfaff9274e40761aa291930430a08b83b524d1b Reviewed-on: https://gerrit.openafs.org/12579 Reviewed-by: Tim Creech Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit ee7019a7630d01f29fecebd89ca69ad8a37e24e2 Author: Andrew Deason Date: Mon Jul 15 16:24:10 2019 -0500 afs: Fix a few ARCH/osi_vcache.c style errors Most of the ARCH/osi_vcache.c implementations were defining functions like:

    void osi_foo(args) {
        /* impl */
    }

But our prevailing style is:

    void
    osi_foo(args)
    {
        /* impl */
    }

Fix them to follow our prevailing style, and fix a couple of the more obvious errors with indentation and goto label. Change-Id: Ie752ee67aa6acfec3bf9a28d7da41151f95fbbf6 Reviewed-on: https://gerrit.openafs.org/13699 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit cba7c62f56f2a98b843fe6f83e22bc03f832e9aa Author: Andrew Deason Date: Mon Jul 15 17:51:41 2019 -0500 afs: Check for invalid afs_fakestat_enable values The only valid values for afs_fakestat_enable right now are 0, 1, and 2.
Check if the given value actually matches one of those, in case we have mismatched libafs/afsd versions, and future code adds new values. Return EINVAL and log a message if we're given an unknown value. Change-Id: I36ad4263e7e3ab311f6edb97a9c48edc035f6753 Reviewed-on: https://gerrit.openafs.org/13698 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit ca472e66fb97572784be429ec264e0e38d1d546b Author: Andrew Deason Date: Tue Aug 14 15:54:29 2018 -0500 LINUX: Turn on AFS_NEW_BKG AFS_NEW_BKG allows libafs to request the afsd background daemon processes to do certain userspace operations. This is currently only used on DARWIN for handling EXDEV file moves, but this framework can be useful on LINUX, as well. So, turn it on for LINUX. This commit does not introduce any new background operations for LINUX to actually use; we're just turning on the new framework. Future commits will introduce new background operations. Change-Id: I5d371f85b87899ce6ab2d5e520954a893679d37e Reviewed-on: https://gerrit.openafs.org/13284 Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit d29ae454adfd135ca434d6d94968b5929efc8e46 Author: Andrew Deason Date: Wed Jul 10 16:24:11 2019 -0500 afs: Remove reference to nonexistent function The real lie here is that TellALittleWhiteLie exists in afs_vcache.c. That has never been true, ever since OpenAFS 1.0. Change-Id: I5ba121db5b4f0bbe7a37054a3d2d8c46f6c49c0a Reviewed-on: https://gerrit.openafs.org/13697 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 3b0a9ff6af68c88d656aefe2242f12a7a9e04969 Author: Andrew Deason Date: Wed Jul 10 12:42:44 2019 -0500 afs: Remove useless afs_GetVCache arguments The 'avc' argument in afs_GetVCache has never been used, all the way back to OpenAFS 1.0. The 'cached' argument was set correctly, but none of its callers ever looked at the result of 'cached'. Remove these useless arguments. 
afs_LookupVCache and afs_GetRootVCache also had the same 'cached' argument, which was also never used by callers. Remove it for those, as well. Change-Id: I3536259f26536acc02fbb058787f417bf0f50b9a Reviewed-on: https://gerrit.openafs.org/13681 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 2b7af1243f46496c0b5973b3fa2a6396243f7613 Author: Cheyenne Wills Date: Fri Aug 9 14:25:03 2019 -0600 LINUX 5.3.0: Use send_sig instead of force_sig Linux 5.3.0 commit 3cf5d076fb4d48979f382bc9452765bf8b79e740 "signal Remove task parameter from force_sig" (part of siginfo-linus branch) changes the parameters for the Linux kernel function force_sig. See LKML thread starting at https://lkml.org/lkml/2019/5/22/1351 According to the LKML discussion and the above commit message force_sig is only safe to deliver a synchronous signal to the current task. To send a signal to another task, we're supposed to use send_sig instead, which has been available since at least linux 2.6.12-rc12. Currently, rx_knet calls force_sig to kill the rxk_ListenerTask. With the Linux 5.3.0 kernel, this module fails to compile due to the above noted changes. Replace the force_sig call with send_sig. In order to use send_sig, the rxk_listener thread must allow SIGKILL and during shutdown (umount) SIGKILL must be unblocked for the rxk_listener thread. Note that SIGKILL is initially blocked on rxk_listener and is only unblocked when shutting down the thread. Having the signal blocked is sufficient to prevent unwanted signals from reaching the rxk_listener thread during normal operation. 
Change-Id: I0c31d66f4ecd887ff9253ba506565592010e8bcb Reviewed-on: https://gerrit.openafs.org/13753 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 02d82275c17284d04629282aa374bb39f511c989 Author: Cheyenne Wills Date: Thu Aug 8 16:53:13 2019 -0600 LINUX 5.3.0: Check for 'recurse' arg in keyring_search Linux 5.3.0 commit dcf49dbc8077e278ddd1bc7298abc781496e8a08 "keys: Add a 'recurse' flag for keyring searches" adds a new parameter to Linux kernel keyring_search function. Update the call to keyring_search to include the recurse parameter if available. Setting the parameter to true (1) maintains the current search behavior. Change-Id: I54b7ed686bf1fb4c42789e5d251ae76789e9fc88 Reviewed-on: https://gerrit.openafs.org/13752 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk Reviewed-by: Andrew Deason commit e3dbd8a5886734f6390126e155cc259b0de5af51 Author: Cheyenne Wills Date: Thu Aug 8 12:07:51 2019 -0600 rxkad: ticket5.c fix typo in #if statement commit 98ca332c4a5ac9e5687fb4fe21b350134bc74d1b (rxkad: v5der.c format truncation warnings) contains a typo in the test for clang (_clang instead of __clang__) Correct the typo in the #if statement to test for __clang__ Change-Id: I0dbe603072740fcf2fb2cb2cea464a48009fee74 Reviewed-on: https://gerrit.openafs.org/13754 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit cc7f942a81a3bbdc8154f511d054a2a018b39ce5 Author: Andrew Deason Date: Wed Jul 10 23:40:55 2019 -0500 LINUX: Disable kernel fortuna large frame errors The rand-fortuna.c we get from Heimdal's hcrypto currently sometimes causes a warning on LINUX when building in the kernel, because fortuna_reseed() has a (potentially) large stack size: .../src/libafs/MODLOAD-.../rand-fortuna-kernel.c:549:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Currently this does not cause the build to fail, even with --enable-checking, since -Werror is not given in the CFLAGS when building our kernel module. 
But if -Werror is passed in CFLAGS (in a future commit), this would cause the build to fail. Since this is an external source file, we cannot change it directly. At least for now, just prevent this warning from breaking the build by passing -Wno-error=frame-larger-than= into the CFLAGS for that file. Change-Id: Ieefdf2dbc318fdcd559435e5f329eef5cf9bb9ba Reviewed-on: https://gerrit.openafs.org/13684 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit bf24b301a10dcb5710a98e58252213bd72c6f352 Author: Cheyenne Wills Date: Fri Aug 2 10:31:13 2019 -0600 restorevol: replace snprintf with asprintf GCC is generating format-truncation warnings. With newer levels of gcc (e.g. gcc8) and --enable-checking, these warnings result in errors and failed builds. In addition, clang8 static analysis tools are reporting memory leaks. Replace snprintf with asprintf and eliminate some of the large work buffers that are being placed on the stack. In order to correct some of the format-truncation errors the size of the buffers grew significantly (e.g. gcc is reporting the need to resize some of the buffers from 256 bytes to 4K in order to eliminate the warnings). Ensure allocated work buffers are freed before function return. Obtained a clean build with gcc9/clang8 with --enable-checking and a clean scan-build report with clang8. Change-Id: Ie8e22fdff2e0ba6494b1b449f413ecbe38f367bd Reviewed-on: https://gerrit.openafs.org/13494 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit e6b97b337bc97fdb1c8e4f1a0572c62dfc82d979 Author: Andrew Deason Date: Mon Jul 29 18:17:21 2019 -0500 afs: Skip IsDCacheSizeOK for CDirty/VDIR IsDCacheSizeOK currently can incorrectly flag a dcache as corrupted, since the size of a dcache may not match the size of the underlying file in a couple of RW conditions: - If someone is writing to a file beyond EOF, the intermediate 'sparse' area may be populated by 0-length dcaches until the data is written to the fileserver.
- Directories may be modified locally instead of being fetched from the fileserver, which can sometimes result in a directory blob of differing sizes. To avoid false positives detecting dcache corruption, just skip the IsDCacheSizeOK check for directories, and any file with pending writes (CDirty). Also add some extra information to the logging messages when this "corruption" is detected, so false positives may be more easily detected in the future. Change-Id: I5130287d0de791cffea85aaec5a0899d5c8d092e Reviewed-on: https://gerrit.openafs.org/13747 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit d6262c3f391e4176bec207fd0e8d4d6091a7f4e2 Author: Cheyenne Wills Date: Fri Jul 26 14:57:02 2019 -0600 gtx: Avoid incomplete function type in casts clang complains that these casts contain an incomplete function type (since the function argument is omitted rather than declared to be void). Since we just need the cast to pointer type, let the compiler do it implicitly and pass stock NULL, rather than trying to force a cast to function-pointer type. Change-Id: Ia2a4cf61d51faef3b4cd469133d9143ca5f57185 Reviewed-on: https://gerrit.openafs.org/13726 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 5792e0211be275cf79d10e8c5f6ab2a14493e07a Author: Yadavendra Yadav Date: Fri Jul 26 19:59:25 2019 +0530 LINUX: Avoid re-taking global lock in afs_dentry_iput The “dput” function can internally call dentry_iput, which results in calling afs_dentry_iput. If the global lock is held when “dput” is called, afs_dentry_iput will try to take the global lock again, resulting in a deadlock. To avoid this deadlock, do not try to take the global lock in afs_dentry_iput if it is already held.
This issue was partially fixed in commit 0dac4de8 (Linux: drop GLOCK before calling dput) Change-Id: I71f18c58d5254f0cf0c68ef04c22268ed70dd50f Reviewed-on: https://gerrit.openafs.org/13725 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 104a9d357da9452305694e97752fe6313fcd22c0 Author: Michael Meffie Date: Wed Jul 24 11:39:43 2019 -0400 build: fix --enable-rxgk help format Move the dnl macros out of the AC_ARG_ENABLE to fix the formatting of the --enable-rxgk help string. Before this commit: $ ./configure --help | grep -C2 rxgk --enable-kauth install the deprecated kauth server, pam modules, and utilities (defaults to disabled) --enable-rxgk Include experimental support for the RXGK security class (defaults to disabled) --disable-strip-binaries After this commit: $ ./configure --help | grep -C2 rxgk --enable-kauth install the deprecated kauth server, pam modules, and utilities (defaults to disabled) --enable-rxgk Include experimental support for the RXGK security class (defaults to disabled) --disable-strip-binaries Change-Id: Iaf6695643f11c7b636e3fba33ee7161e21df23a6 Reviewed-on: https://gerrit.openafs.org/13722 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 4a57cc54dfb6789a86ee735360ee44209c1a901a Author: Cheyenne Wills Date: Tue Jul 2 16:58:28 2019 -0600 ptserver: testpt.c format-overflow warning GCC 9 introduced new warnings/errors and is flagging a sprintf with a format-overflow warning. With --enable-checking, this error is causing testpt.c to fail during compile. Change the buffer size from 16 bytes to PR_MAXNAMELEN+1 and use snprintf instead of sprintf. Generate an error message and exit if snprintf truncates the string.
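The truncation check described for testpt.c can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: example_format_name is a hypothetical helper, and EXAMPLE_MAXNAMELEN is a stand-in for PR_MAXNAMELEN whose value is assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for PR_MAXNAMELEN; the value is an assumption for this sketch. */
#define EXAMPLE_MAXNAMELEN 64

/*
 * Hypothetical helper (not taken from testpt.c): format "<prefix><id>"
 * into buf. Returns 0 on success, or -1 if snprintf failed or the
 * result did not fit in buf and was truncated.
 */
int
example_format_name(char *buf, size_t buflen, const char *prefix, int id)
{
    int n;

    assert(buf != NULL && buflen > 0);

    n = snprintf(buf, buflen, "%s%d", prefix, id);
    /* snprintf returns the length it wanted to write, excluding the NUL. */
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```

A caller would size the buffer as EXAMPLE_MAXNAMELEN + 1 and treat a -1 return as fatal, printing an error and exiting, as the commit message describes.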
Change-Id: I30fbe0971ba3e05dc6ac61e7b2ded2fd1777374d Reviewed-on: https://gerrit.openafs.org/13663 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 41ee558329560bce037ad2860282d8b49aa11b2d Author: Cheyenne Wills Date: Fri Jul 26 07:59:33 2019 -0600 uss: uss_procs.c format-overflow warning GCC 9 introduced new warnings/errors and is flagging a sprintf with a format-overflow warning. With --enable-checking, this error is causing uss_procs.c to fail during compile. A file name with the full path is being composed and the size of the buffer was triggering a possible format-overflow warning/error. Use asprintf to allocate the buffer dynamically instead of using a buffer sitting on the stack (reducing the stack requirements by 2K). Produces new error message if asprintf returns an error. Change-Id: Ib233052aab9c3bc1ec24dac7e70f97933b478d3e Reviewed-on: https://gerrit.openafs.org/13664 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit f938f5f248a3cb3f7ac871f5ef45a0e2d043706b Author: Cheyenne Wills Date: Tue Jun 25 15:39:40 2019 -0600 ptserver: Incorrect variable used to print error msg In testpt.c the variable cdir is used to print the name of the temporary dir. However at this point in the code cdir is NULL and the variable tmp_conf_dir contains the actual name that should be used in the error message. Flagged as an error when --enable-checking is on and using GCC 9. Change-Id: I0c854fd89c0bae1c313ae1f382e58fd410b719e6 Reviewed-on: https://gerrit.openafs.org/13662 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 98ca332c4a5ac9e5687fb4fe21b350134bc74d1b Author: Cheyenne Wills Date: Mon Jul 15 08:38:24 2019 -0600 rxkad: v5der.c format truncation warnings GCC 7 is producing new warnings due to better compile time analysis. With --enable-checking v5der.c is failing with 2 errors due to possible format-truncation in some snprintf calls.
The format strings are being used to format date and time values from a tm structure. The actual warnings/errors are being triggered from arithmetic being performed on the year and month members of the structure. The resulting values should not exceed the format lengths, but the compilers are still flagging the statements. v5der.c is part of the heimdal package that is pulled into the openafs source tree. v5der.c is not compiled directly but is #included in ticket5.c Update ticket5.c to change the severity of the format-truncation diagnostic to a warning if using GCC 7 (or higher). Note: since v5der.c is pulled from an external source (heimdal), any changes to update v5der.c directly would need to be performed upstream. Change-Id: Icda0d86444f505604abe9fa1cc2450d7538be7ef Reviewed-on: https://gerrit.openafs.org/13661 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit eaae6eba8ca10ba7a5a20ee0d1b5f91bc2bac6c6 Author: Benjamin Kaduk Date: Thu Jul 11 21:07:35 2019 -0700 aklog: require opt-in to enable single-DES in libkrb5 Since the introduction of rxkad-k5 in response to OPENAFS-SA-2013-003, it is not strictly necessary to configure libkrb5 to allow weak crypto in order to obtain an AFS token. A sufficient amount of time has passed since then that it is safe to assume that the default behavior is the more-secure one, and require opt-in for the insecure behavior. To indicate that the use of single-DES is quite risky, add the "-insecure_des" argument to both klog and aklog, to gate the preexisting calls that enable weak crypto/single-DES. These calls, and the -insecure_des option, may be removed entirely in a future commit.
Change-Id: If175d0f95f0ede0f252844086a2a023da5580732 Reviewed-on: https://gerrit.openafs.org/13689 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 5f48367f2bd5bf1c0e689c79508177b649b9113b Author: Andrew Deason Date: Mon Mar 25 16:33:39 2019 -0500 afs: Avoid non-dir ENOENT errors in afs_lookup Historically, there have been many subsystems in libafs that can generate ENOENT errors for a variety of reasons. In addition to the expected case where we lookup a name that doesn't exist, other scenarios have caused ENOENT error codes to be generated, such as: internal inconsistencies, I/O errors, or even abort codes from the network. When one of these scenarios causes an ENOENT error code during afs_lookup() when the target name does actually exist, it can be confusing to a user, or even result in incorrect application behavior. On Linux in particular, ENOENT results from a lookup are cached in negative dcache entries, and so can cause future lookups for the same name to yield ENOENT errors. Various commits have tried to avoid this abuse of the ENOENT error code, such as 2aa4cb04 (afs: Stop abusing ENOENT). But we cannot prevent receiving ENOENT abort codes from the network, and mistakes in the future may cause more scenarios incorrectly yielding ENOENTs. However, in afs_lookup, we do know that legitimate ENOENT errors can only occur in one situation: when we have a valid directory blob, and the afs_dir_Lookup() operation itself returns an ENOENT error for the target name. For all other areas of afs_lookup(), we know that an ENOENT error is not legitimate, since we may not be sure if the target name exists or not. So to proactively avoid incorrect ENOENT results, prevent afs_lookup from returning ENOENT, except in the specific code path where afs_dir_Lookup is called.
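The idea of allowing ENOENT only from the directory lookup step might be sketched as below. This is an assumption-laden illustration: the stage enum and example_sanitize_lookup_error are hypothetical names, and the choice of EIO as the replacement error code is an assumption, not something stated in the commit message.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stages of a lookup; only one may legitimately yield ENOENT. */
enum example_stage {
    EXAMPLE_FETCH_DIR,     /* fetching the directory blob */
    EXAMPLE_DIR_LOOKUP,    /* searching the valid directory blob for the name */
    EXAMPLE_FETCH_TARGET   /* fetching the target found in the directory */
};

/*
 * Hypothetical filter: pass ENOENT through only when it came from the
 * directory lookup itself. Any other ENOENT is replaced (EIO is an
 * assumed choice), so callers never cache a bogus "does not exist".
 */
int
example_sanitize_lookup_error(int code, enum example_stage stage)
{
    assert(stage >= EXAMPLE_FETCH_DIR && stage <= EXAMPLE_FETCH_TARGET);

    if (code == ENOENT && stage != EXAMPLE_DIR_LOOKUP)
        return EIO;
    return code;
}
```

Funnelling every exit path through one filter like this guards against future code paths that produce ENOENT by mistake, which is the proactive behavior the commit message argues for.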
Change-Id: I1c91600fd38b1179f02fa6eadea631b6eb8edb6d Reviewed-on: https://gerrit.openafs.org/13537 Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit fa15fbda0aa0c3810695d9b867d3258b60e76b7c Author: Andrew Deason Date: Tue Jul 24 23:22:01 2018 -0500 LINUX: Minor osi_vfsop.c cleanup - Fix the formatting on afs_mount/afs_get_sb definitions - Declare a couple of functions static that are not referenced outside of this file Change-Id: I4880c27dbe2acd296262d29f91736d0028a029c0 Reviewed-on: https://gerrit.openafs.org/13282 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 397199a1992d74d8b7e693a2d76df836f7a70080 Author: Andrew Deason Date: Tue Aug 14 15:53:20 2018 -0500 afs: Add AFS_USPC_SHUTDOWN bkg request When AFS_NEW_BKG was added, the kernel module indicated to the relevant afsd process that it's time to shutdown by returning -2. This works on DARWIN, but it's difficult to make this work on all platforms, because of the different way that platforms handle error codes from our pioctls and other AFS syscalls. Specifically, on LINUX, negative error codes are assumed to be negative errno codes, and so returning -2 from the syscall handler means we return -1 to userspace, with errno set to 2 (ENOENT). Getting this to work consistently across platforms is probably more trouble than it's worth, so instead of relying on specific return codes from the syscall, just add a new background daemon operation called AFS_USPC_SHUTDOWN, which just tells the background daemon to exit. Change-Id: I00b245c8f734dc9e49d6b4268cd0f6a4f1896894 Reviewed-on: https://gerrit.openafs.org/13281 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 79dffe29c8a0ec55c4231a18077efdfa7c1edf53 Author: Cheyenne Wills Date: Fri Jul 5 08:23:10 2019 -0600 libadmin: overlap warning in strcpy with gcc9 GCC 9 with --enable-checking produces a new warning/error in afs_utilAdmin.c associated with a strcpy with the potential of an overlap.
The index used is signed which triggers the new warning. The source and target of the strcpy are contained within the same higher level structure. Change the variable 'index' from signed to unsigned to resolve the warning/error. Change the variable 'total' in the same structure to unsigned to be consistent with its usage with 'index'. Change-Id: Icaa99e278a5d8262caeaec0b2723e826a57554aa Reviewed-on: https://gerrit.openafs.org/13660 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 7c60a0fba11dd24494a5f383df8bea5fdbabbdd7 Author: Andrew Deason Date: Thu Jan 17 16:21:25 2019 -0600 afs: Check dcache size when checking DVs Currently, if the dcache for a file has nonsensical length (due to cache corruption or other bugs), we never notice, and we serve obviously bad data to applications. For example, the vcache metadata for a file may say the file is 2k bytes long, but the dcache for that file only has 1k bytes in it (or more commonly, 0 bytes). This situation is easily detectable, since the dcache and vcache refer to the same version of the same file (when the DVs match), and so we can check if the two lengths make sense together. So to avoid giving bad data to userspace applications, perform a sanity check on the lengths at the same time we check for DV matches (to see if the dcache looks "fresh" and not stale). If the lengths do not make sense together, we just pretend that the dcache is old, and so we'll ignore it and fetch a new copy from the fileserver. Also check the size of the data fetched from the fileserver for a newly-fetched dcache in afs_GetDCache, to avoid returning a bad dcache if the dcache isn't already present in the cache.
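A rough sketch of combining the DV match with a length sanity check is shown below. All names here (the example_* structs, EXAMPLE_CHUNK_SIZE, and both functions) are hypothetical, and the exact comparison used in libafs may well differ; the sketch only assumes that a cached chunk should hold exactly the part of the file that falls inside it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed chunk size for the sketch; real cache chunk sizes are configurable. */
#define EXAMPLE_CHUNK_SIZE 1024u

struct example_vcache {
    uint64_t data_version;  /* DV of the file per its metadata */
    uint64_t length;        /* file length per its metadata */
};

struct example_dcache {
    uint64_t data_version;  /* DV this cached chunk was fetched at */
    uint64_t chunk_offset;  /* file offset where this chunk starts */
    uint64_t chunk_bytes;   /* bytes actually present in the chunk */
};

/* Do the chunk's contents make sense for a file of the recorded length? */
int
example_size_ok(const struct example_dcache *dc, const struct example_vcache *vc)
{
    uint64_t expected = 0;

    if (dc->chunk_offset < vc->length) {
        expected = vc->length - dc->chunk_offset;
        if (expected > EXAMPLE_CHUNK_SIZE)
            expected = EXAMPLE_CHUNK_SIZE;
    }
    return dc->chunk_bytes == expected;
}

/* A chunk is usable only if the DVs match and the sizes are consistent. */
int
example_is_fresh(const struct example_dcache *dc, const struct example_vcache *vc)
{
    assert(dc != NULL && vc != NULL);

    if (dc->data_version != vc->data_version)
        return 0;
    return example_size_ok(dc, vc);
}
```

Treating a size mismatch the same as a stale DV means the caller needs no new error path: the chunk is simply refetched from the fileserver.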
Change-Id: I338a4962322d8c0d06d1ea25fd7d252b5f83dc9f Reviewed-on: https://gerrit.openafs.org/13436 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit eed79e2d28dcab889d01869e57dec14fd30d421c Author: Andrew Deason Date: Wed Jul 3 12:55:53 2019 -0500 LINUX: Unlock page on afs_linux_read_cache errors When afs_linux_read_cache is called with a non-NULL task, it is responsible for unlocking 'page' (unless it's unlocked in a background task), even if we encounter an error. Currently we almost always do unlock the given page for a non-NULL task, but if we manage to hit one of the codepaths that 'goto out', we skip over the unlock_page() call near the end of the function, and the page never gets unlocked. As a result, the page stays locked forever. That generally means any future access to the same file will block forever, and when we try to flush the relevant vcache, we will block waiting for the page lock while holding GLOCK. (This can happen via the background daemon via e.g. afs_ShakeLooseVCaches -> osi_TryEvictVCache -> afs_FlushVCache -> osi_VM_FlushVCache -> vmtruncate -> ... -> truncate_inode_pages_range -> __lock_page on Linux 2.6.32-754.2.1.el6.) This quickly brings the whole client to a halt until the machine can be forcibly rebooted. To solve this, just move the 'out:' label to before the page unlock. Add a few locking-related comments around the relevant code to help explain some relevant details. The relevant code has changed and been refactored over the years, but this problem has probably existed ever since this code was originally converted to using the readpage() of the underlying cache fs, in commit 88a03758 (Use readpage, not read for fastpath access). 
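The effect of moving the 'out:' label ahead of the unlock can be illustrated with a small userspace sketch. The example_page struct and example_read_cache function are hypothetical stand-ins, not the kernel code: a plain integer flag plays the role of the page lock, and the fail_early parameter simulates hitting one of the error paths.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel page; 'locked' models the page lock. */
struct example_page {
    int locked;
};

/*
 * Hypothetical read routine. The caller passes in a locked page. When
 * have_task is nonzero, this function is responsible for unlocking the
 * page on every path, including errors.
 */
int
example_read_cache(struct example_page *page, int have_task, int fail_early)
{
    int code = 0;

    assert(page != NULL && page->locked);

    if (fail_early) {
        code = -1;
        goto out;
    }

    /* ... the normal read path would go here ... */

 out:
    /* The label sits before the unlock, so error paths release the page too. */
    if (have_task)
        page->locked = 0;
    return code;
}
```

If the label were placed after the unlock instead, the fail_early path would return with the page still locked, which is the hang the commit message describes.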
Change-Id: If7e882ed54ca93ad6b9fdda938c606b241236241 Reviewed-on: https://gerrit.openafs.org/13672 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 0d8ce846ab2e6c45166a61f04eb3af271cbd27db Author: Andrew Deason Date: Thu Jan 17 15:45:36 2019 -0600 afs: Introduce afs_IsDCacheFresh Numerous places in libafs check the DV of a dcache against the DV of the vcache for the same file, in order to check if the dcache is up to date and can be used. Consolidate all of these checks into a new function, afs_IsDCacheFresh, to make it easier for future commits to alter this logic. This commit should have no visible impact; it is just code reorganization. Change-Id: Iedc02b0f5d7d0542ab00ff1effdde03c2a851df4 Reviewed-on: https://gerrit.openafs.org/13435 Reviewed-by: Benjamin Kaduk Tested-by: Andrew Deason commit fb9de9e5fd4822df043a0d46e6a1101df2e08b85 Author: Andrew Deason Date: Thu Nov 15 12:37:16 2018 -0600 afscp: Add -l option Add the -l option to afscp, to "loop" the given FetchData/StoreData request over and over. When using this mode, we alternate between using a couple of rx calls, to avoid getting slowed down by rx BUSY packets when we start a new call on the same channel too quickly. Change-Id: I90ee8e9804a0bf59ff654398b1fe6e46a99a3062 Reviewed-on: https://gerrit.openafs.org/13657 Reviewed-by: Benjamin Kaduk Tested-by: Andrew Deason commit b0278994826f6bd1dfebc39f26282b8fbdadf1a0 Author: Mark Vitale Date: Wed May 22 22:50:00 2019 -0400 auth: make PGetTokens2 work with 3-char cellnames PGetTokens2 accepts two different types of input: - an integer 'iterator' to request the nth token set for a user - a string cellname to request the user's token set for that cell Unfortunately, it distinguishes between these by assuming if the input length is sizeof(afs_int32) (4 bytes), it must be an integer. This assumption is incorrect if the cellname is three (3) characters long plus a nul terminator. 
The result is that the cellname string is interpreted as a very large "n"; the subsequent search for the user's "very-large-nth-token" fails, making it appear that the user has no valid token for this cell. Improve on this heuristic by double-checking any putative integer input. If it is actually a 3-character string, then process the input as a cellname instead. Introduced by commit 5ec5ad5dcca84e99e5f55987cc4f787cd482fdde 'New GetToken pioctl'. While here, add doxygen comments. Change-Id: Ifa226fa1c35b95bc32642870f73359f97a9f1d61 Reviewed-on: https://gerrit.openafs.org/13599 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Reviewed-by: Andrew Deason commit 95ae30c30d98a3219fd021e0ed83200c1b6c266f Author: Mark Vitale Date: Wed May 22 23:03:11 2019 -0400 auth: eliminate pointless retries in ktc_ListTokensEx ktc_ListTokensEx is an iterator to provide the names of each cell for which a user has a token set. It does this by looking for the 1 through nth token set for a given user. However, as currently implemented, it always continues searching up to the 100x safety limit even when there are no more token sets for the user. Instead, return immediately when VIOC_GETTOK2 returns EDOM (no more tokens for this user). Introduced by commit a86ad262d2a8be36f43ab0885a84dde37ddfc464 'auth: Add the ktc_ListTokensEx function'. Change-Id: I880edc80fc6c5580e5919b74b0b561317a1455f0 Reviewed-on: https://gerrit.openafs.org/13598 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 4eeed830fa31b7b8b5487ba619acbc8d30642aaa Author: Andrew Deason Date: Wed Jun 26 17:03:03 2019 -0500 afscp: Link against opr/roken/hcrypto Link afscp against libopr, libroken, and libafshcrypto, so afscp can be built again. 
Change-Id: I43ac3a8e7ed1ff012f4ae48ed6b81f5d0cd1d590 Reviewed-on: https://gerrit.openafs.org/13656 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit f5f59cd8d336b153e2b762bb7afd16e6ab1b1ee2 Author: Cheyenne Wills Date: Tue Jun 25 10:40:53 2019 -0600 util: serverLog using memory after free clang's scan-build detected a "use of memory after it is freed" condition. The function OpenLogFile frees the variable ourName before creating a duplicate of the name passed to it. However there is a call that uses ourName as the parameter: OpenLogFile(ourName). This results in freeing ourName then doing a strdup of the same memory location. Test the passed parameter and if it's the same as ourName already skip the free and strdup. This bug was introduced in commit 340ec2f79208ee21c3130c4b1c13995947ce426c "util: allocate log filename buffers" Change-Id: I770008b074e0003c7c1532128f8322da811d6fcc Reviewed-on: https://gerrit.openafs.org/13659 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 1210a8d6d96db2d84595d35ef81ec5d176de05e8 Author: Andrew Deason Date: Fri Jun 28 14:14:48 2019 -0500 LINUX: Run the 'sparse' checker if available The Linux kernel module buildsystem supports running an external tool (by default, the 'sparse' tool) during the build to run additional static checks on the source code to flag various warnings. Tell the kernel build to run such a tool, if 'sparse' is installed. This causes various new warnings in the build, such as: CHECK /.../src/libafs/MODLOAD-4.9.0-8-amd64-MP/afs_tokens.c /.../src/libafs/MODLOAD-4.9.0-8-amd64-MP/afs_tokens.c:73:1: warning: symbol 'afs_FreeOneToken' was not declared. Should it be static? /.../src/libafs/MODLOAD-4.9.0-8-amd64-MP/afs_tokens.c:160:1: warning: symbol 'afs_IsTokenExpired' was not declared. Should it be static? /.../src/libafs/MODLOAD-4.9.0-8-amd64-MP/afs_tokens.c:187:1: warning: symbol 'afs_IsTokenUsable' was not declared. Should it be static? 
None cause the build to fail currently, but are just printed for potential further investigation. To control detecting 'sparse', add the --with-sparse configure option and SPARSE configure variable. Default to checking if sparse is available, and enabling it if so. Further information on using sparse in the Linux kernel is available in Documentation/sparse.txt in the Linux tree. Using 'sparse' during the build was suggested by yadayada@in.ibm.com. Change-Id: I57944d792ba1c8093196a8b335a12dfa741b119b Reviewed-on: https://gerrit.openafs.org/13665 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 3f0b9907d12c00725548dbaf84fee3e033cb974c Author: Pat Riehecky Date: Tue Jun 12 13:55:56 2018 -0500 afs: test condition mismatch resolved While it is unexpected, it is possible for the two disconnected flags to get out of sync resulting in a path to an undefined variable in use. (via cppcheck) Change-Id: I995b402e73c2c330485050dd2594a62fe67d1bca Reviewed-on: https://gerrit.openafs.org/13207 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit fbe2a03aa69bc19768302685d902a25e4d6e157a Author: khm Date: Tue Jun 25 12:51:21 2019 -0700 add dkms dependency in Red Hat unit file Currently, there is no explicit relationship between OpenAFS and dkms. If dkms needs to rebuild the kernel module, OpenAFS will fail to mount because modprobe will not load the module. This change specifies that OpenAFS should run after dkms if dkms is present.
Change-Id: I104cb3780bbc1196cf36852f094ca07c80279d01 Reviewed-on: https://gerrit.openafs.org/13654 Tested-by: BuildBot Reviewed-by: Michael Laß Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 877d9d79a32b9e81911cb567f844b11c693229f0 Author: Andrew Deason Date: Tue Oct 30 15:41:22 2018 -0500 aklog: Avoid misleading AFSCELL message Currently, if the AFSCELL environment variable is set, aklog (and other libauth-using utilities) print out a message when afsconf_GetLocalCell is called: Note: Operation is performed on cell env.example.com However, this message is also printed (with the AFSCELL cell) when aklog is given the -cell command-line argument, even though aklog actually uses the cell given on the command line. For example: $ AFSCELL=env.example.com aklog -cell cli.example.com -d Note: Operation is performed on cell env.example.com Authenticating to cell cli.example.com (server srv1.example.com). [...] libauth will normally not print the "Operation" message if we're not using the default cell, but it determines this by checking if someone called afsconf_GetCellInfo before calling afsconf_GetLocalCell. And currently, aklog calls afsconf_GetLocalCell before afsconf_GetCellInfo, so the message gets printed because libauth has no way of knowing that we're actually using a different cell. klog gets around this by making an additional ignored call to afsconf_GetCellInfo before afsconf_GetLocalCell, but we can fix this in aklog by just changing the order of the calls. So, just call afsconf_GetCellInfo first; if we're using the local cell, we can just give a NULL cell parameter, instead of looking up the local cellname first. 
Change-Id: I53469ee93d6e88632a944a87a031e0ffa4ede584 Reviewed-on: https://gerrit.openafs.org/13371 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit e14a69cf925172d699c2ff31078f8a634a90747f Author: Andrew Deason Date: Sat Dec 8 15:08:26 2018 -0600 rx: Set listener pthread name When running under pthreads, set the name of the rx listener thread to "rx_Listener". This can be handy when investigating rx performance issues, since it makes it easier to identify which thread is the rx listener. Don't do this for "hot threads", since in that case we could return and stop being a listener thread. We could restore the original thread name, but doing so could have an impact on performance and "hot threads" should always be disabled these days, so don't bother. Change-Id: I24aebd4d7e4266cd06bb1a4314949d85835dfbaa Reviewed-on: https://gerrit.openafs.org/13600 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 9d28f7390332c92b3d9e863c6fe70c26db28b5ad Author: Andrew Deason Date: Wed Jun 26 11:47:21 2019 -0500 Move afs_pthread_setname_self to opr Move the functionality in afs_pthread_setname_self from libutil to opr, in a new function opr_threadname_set. This allows us to more easily use the routine in more subsystems, since most code already uses opr. Change-Id: I79d49617a19cd292a3b09ccfd9c9f319355a184e Reviewed-on: https://gerrit.openafs.org/13655 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 99418024276c94da5982d7dad6126a8d53924d7e Author: Andrew Deason Date: Sun Jun 23 17:48:53 2019 -0500 libafs: Create $(DESTDIR)$(KMODDIR) on FBSD inst We rely on bsd.kmod.mk for our actual rules during 'make install', but that tries to install our kernel module into $(DESTDIR)$(KMODDIR), without creating it first.
If the user tries to 'make install DESTDIR=/some/path' and that path doesn't exist, we will fail with something like: make DESTDIR=/home/adeason/git/destdir single_instdir_libafs /usr/bin/install -c -T release -o root -g wheel -m 555 libafs.ko /home/adeason/git/destdir/boot/modules/ install: /home/adeason/git/destdir/boot/modules/: No such file or directory *** Error code 71 To avoid this, add a dependency on the 'install' target which causes our target dir to be created. Change-Id: Icacc507867420265383e411572006df47ef22815 Reviewed-on: https://gerrit.openafs.org/13653 Tested-by: BuildBot Reviewed-by: Tim Creech Reviewed-by: Benjamin Kaduk commit 85d70ea953c6fb44f200ed4be13cded7413559b8 Author: Andrew Deason Date: Sun Jun 23 16:25:27 2019 -0500 asetkey: Fix random_key for Heimdal Go through our deref_key_length/deref_key_contents abstractions, so we can compile with Heimdal krb5. Also fix these macros to properly separate the 'key' macro argument, so we can use the macros in these new places. Change-Id: I3ee53bc70494a67ac5463819dc575c8ee37647c9 Reviewed-on: https://gerrit.openafs.org/13652 Tested-by: BuildBot Reviewed-by: Tim Creech Reviewed-by: Benjamin Kaduk commit 34fd532e35b6f373304effaa16c9c65062b12cd9 Author: Andrew Deason Date: Wed Aug 1 18:38:51 2018 -0500 DARWIN: Use tb->code_raw for BOP_MOVE Currently, BOP_MOVE communicates its error code to the requestor via the 'retval' field in struct afs_uspc_param, and we assume ptr_parm[0] of the given brequest is for a struct afs_uspc_param. But this is unnecessary, since struct brequest already has fields for error codes; namely, code_raw and code_checkcode. To avoid afs_BackgroundDaemon needing to interpret ptr_parm[0] in this way (and assuming the type of the pointer's target), change BOP_MOVE to just use the code_raw field for error codes, instead of interpreting ptr_parm[0]. This makes it easier to add more AFS_NEW_BKG background operations that do not pass a struct afs_uspc_param in the brequest parameters. 
Change-Id: I90a564468862142777159fbb78234744840b59fb Reviewed-on: https://gerrit.openafs.org/13280 Reviewed-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 0c1d124b0b6ea3117885d2bca163170515cb8713 Author: Andrew Deason Date: Mon Aug 20 15:47:13 2018 -0500 rxkad: Update ticket5 from heimdal This updates the rxkad code that we pull from heimdal to heimdal 7.7.0 (heimdal.git commit e1959605bd). This also updates the instructions in README.v5 to accommodate changes in the heimdal tree, and converts ticket5.c to use KRB5_ENCTYPE_* constants instead of ETYPE_* constants (since heimdal has also similarly converted in krb5_asn1.h). This removes a few -Werror=format-truncation warnings that were present in the heimdal code before this commit. README.v5 tweaked in collaboration with kaduk@mit.edu. Change-Id: I5fdaab600b4a1b42658a60259fde3fc9f7dced04 Reviewed-on: https://gerrit.openafs.org/13287 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 54c34d32e884a5bfb2352e7c8767d743ef3e4647 Author: Mark Vitale Date: Wed Jun 12 23:44:32 2019 -0400 afs: remove bogus comment from afs_IsTokenExpired Remove an incorrect comment, introduced with commit adf2e6e827c6caf55247c5e63b88775393156ae5 'Unix CM: Generalise token storage'. No functional change is incurred by this commit. Change-Id: Ie56c4f22a06321c56f62fce9704419ce3c4e7bf2 Reviewed-on: https://gerrit.openafs.org/13640 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 3a5ab19fe04058e002bfea90f8b64fab4676de67 Author: Benjamin Kaduk Date: Fri Apr 19 10:38:24 2019 -0500 afs: add a file-level comment to afs_osidnlc.c This file doesn't currently do a great job of telling the reader what it's used for. Let's give them a hint, especially for the expansion of "DNLC". 
Change-Id: Ie5d1f1162a4b59c479bc2961b33cd696e83bdc3a Reviewed-on: https://gerrit.openafs.org/13557 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 30a6ab30f2451b9788328336dd937a4263f5f5c7 Author: Andrew Deason Date: Tue Feb 26 20:47:00 2019 -0600 ptserver: Check for superuser in WhoIsThisWithName In WhoIsThisWithName, if we don't understand the rx security class being used (such as rxgk), we'll set the calling id to the anonymous user and return an error. But for SYSADMINID specifically, we don't really need to know any security-class-specific details; we just need to know that the caller is the superuser. So add a fallback case to check for that; if we don't understand the calling rx security class, just check if the calling user is RX_ID_SUPERUSER, and use SYSADMINID if so. This allows the ptserver to handle rxgk localauth requests (and theoretically, localauth requests for any future security classes). Based on a commit from mvitale@sinenomine.net. Change-Id: Ia9bc91fb5a0d9ebf16b32659c9068aa5a9da8401 Reviewed-on: https://gerrit.openafs.org/13508 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 316b862af6b6731f57a21f81b0948f3718b4c9f3 Author: Mark Vitale Date: Mon Feb 11 01:21:08 2019 -0500 ptclient: rxgk support Allow ptclient to use rxgk, with the new -rxgk option. While we're here, also allow the user to specify a security level of 3, to turn on rxkad encryption for non-localauth conns.
Change-Id: I201154c1b5298f31912d8841f8310363e13afa08 Reviewed-on: https://gerrit.openafs.org/13501 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit e5b1e6f1adbe10e366bb4d9c745e90193badc1fb Author: Benjamin Kaduk Date: Sun Apr 13 22:01:59 2014 -0400 Add rxgk client options to vl and pt utilities Add options to use rxgk for outgoing connections to vlserver, vos, ptserver, and pts. For vlserver and ptserver, name the new option -s2scrypt, similar to the existing volserver option -s2scrypt. For vlserver and ptserver, specify 'rxgk-crypt' to turn on rxgk crypt connections for our server-to-server ubik communication. For vos and pts, just name the new option '-rxgk', and allow the user to specify the rxgk level to use ('clear', 'auth', or 'crypt'). The pts code is currently somewhat ill-suited to changing what rx security class and security level we use, but do the best we can without refactoring the whole thing. Change-Id: Iefae46291330d2b5e05b2a2bbaec1b9150b3c892 Reviewed-on: https://gerrit.openafs.org/11105 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit fc7e1700fe84f623fb9163466d24226df00b1a2c Author: Mark Vitale Date: Wed May 22 22:52:10 2019 -0400 pioctl: limit fruitless token searches getNthCell searches the afs_users table for the nth token set belonging to a given user. However, it is impossible for a user to have more than one token set per cell. If the caller specifies a number greater than the total number of cells this cache manager knows about, we know the search will be fruitless. Instead, return early in this case, avoiding both the lock and the search. 
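The early return for fruitless token searches could look roughly like the sketch below. Everything here is hypothetical (the example_* types, the flat user table, and the 'searches' counter, which exists only to make the skipped search observable); it only assumes that the requested index is 1-based and that a user holds at most one token set per cell.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry: one per (user, cell) pair known to the cache manager. */
struct example_user {
    int uid;
    int cell;
    int has_tokens;
};

/* Hypothetical cache manager state for the sketch. */
struct example_cm {
    const struct example_user *users;
    size_t nusers;
    int ncells;    /* total number of cells known */
    int searches;  /* counts table walks, for illustration only */
};

/*
 * Return the cell of the nth (1-based) token set held by uid, or -1 if
 * there is none. Since a user has at most one token set per cell, any
 * n above ncells cannot match, so we return before searching at all.
 */
int
example_get_nth_cell(struct example_cm *cm, int uid, int n)
{
    size_t i;
    int seen = 0;

    assert(cm != NULL);

    if (n < 1 || n > cm->ncells)
        return -1;  /* early exit: no lock taken, no table walk */

    cm->searches++;
    for (i = 0; i < cm->nusers; i++) {
        if (cm->users[i].uid == uid && cm->users[i].has_tokens) {
            if (++seen == n)
                return cm->users[i].cell;
        }
    }
    return -1;
}
```

In the real code the table walk happens under a lock, so the bound check saves the lock acquisition as well as the search.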
Change-Id: I509408d9aaa8f511813c4d82c121e199121bb8f3 Reviewed-on: https://gerrit.openafs.org/13597 Tested-by: BuildBot Tested-by: Mark Vitale Reviewed-by: Michael Meffie Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 8d2306e1dae84af9ccbadd2518beaf8543d4413b Author: Andrew Deason Date: Wed May 15 14:35:41 2019 -0500 Add --quiet option to lwptool Add an option to lwptool, called --quiet, to suppress printing the literal commands run. On error, we still print the exact failed command to stderr. For "pretty" V=0 builds, use this new option, to make our lwptool-using compile rules look more like our other compile rules. Change-Id: I3fed6db3205f8de5e275e9b70aba9e1995afd02f Reviewed-on: https://gerrit.openafs.org/13594 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 4b6a4ff31a4197504bbcf2d4c14c24dee672d40e Author: Andrew Deason Date: Thu May 16 20:01:17 2019 -0500 Use the ppc64le_linuxXX sysname for ppc64le builds Commit 191e18eb (Open ppc64le_linux sysname space) added the ppc64le_linux26 sysname, but it still must be manually specified when running on ppc64le. Use the ppc64le_linux26 by default on ppc64le, so we can compile without needing to specify an explicit sysname. Change-Id: I5abbdde06622d5f2b067bfd003f9d4cd51c56f1a Reviewed-on: https://gerrit.openafs.org/13593 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit 46563f929a851032d785634763963808d6e2bfeb Author: Andrew Deason Date: Thu May 16 16:12:47 2019 -0500 Do not define AFS_SYSCALL for ppc64le_linux26 AFS_SYSCALL is defined to the syscall number we can use for a certain platform (for pioctls and other AFS-specific kernel calls). On many modern platforms, such as Linux, we don't use direct syscalls anymore, instead routing our AFS-specific syscalls through an ioctl, and AFS_SYSCALL is just used as a fallback for compatibility for older OpenAFS releases that might still be using the syscall. 
For new platforms, we have no need for this compatibility code path, since there is no existing code we might need to be compatible with. We should avoid defining AFS_SYSCALL for those, so we can avoid manually-issuing syscalls in more cases. The ppc64le_linux26 platform is a very new platform (introduced in 191e18eb "Open ppc64le_linux sysname space"), and so should not have AFS_SYSCALL defined. So, remove AFS_SYSCALL from ppc64le_linux26's param.h. Change-Id: I7811831b05a17c9428556aca49681cd544da4ff1 Reviewed-on: https://gerrit.openafs.org/13592 Tested-by: BuildBot Reviewed-by: Mark Vitale Tested-by: Andrew Deason Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 191e18ebcee3698a76b55912de0a41111c384128 Author: Nathaniel Filardo Date: Wed May 1 23:01:51 2019 +0100 Open ppc64le_linux sysname space While here, add config/param.ppc64le_linux26.h; it's just like ppc64_linux26.h, except not AFSBIG_ENDIAN. Change-Id: I6671405f829f2bf50b6e8d3355ab9e8aed384c02 Reviewed-on: https://gerrit.openafs.org/13562 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Jeffrey Altman Reviewed-by: Benjamin Kaduk commit 5cd5cd9fa8754a5af346fa6a392363b046316c75 Author: Pat Riehecky Date: Fri Jun 1 16:33:37 2018 -0500 Fix static expressions in conditionals The conditions in these if statements are always true (or always false). Remove the check in cmdebug.c, as it is unnecessary, and fix the check in vlclient.c to actually check for a valid voltype. 
(via cppcheck) Change-Id: Ica7dfc9b81fe8bd0f156f6e4e616ed45e205985a Reviewed-on: https://gerrit.openafs.org/13158 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 13817774518ada28f5fe68e0d00ef5dd00b67b55 Author: Cheyenne Wills Date: Thu Apr 18 09:55:09 2019 -0600 redhat: RHEL8 add elfutils-devel as build dependency for kernel module Building the kernel modules under RHEL8 produces the following error message: Makefile:952: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. Add elfutils-devel to the BuildRequires in the rpm spec when building rhel >= 8 Add elfutils-devel to the BuildRequires in the rpm spec that openafs-kmodtool produces FIXES 134900 Change-Id: Ie3e03336d9599caa6ceb7879199eab3b12eb971b Reviewed-on: https://gerrit.openafs.org/13560 Tested-by: BuildBot Reviewed-by: Stephan Wiesand Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 9779dd29e7bd76a2b3b759587d6eb919682dfba0 Author: Andrew Deason Date: Thu Nov 9 12:50:53 2017 -0600 asetkey: add 'add-random' command Add a new command, 'add-random', to allow the creation of a new key with random data. This is helpful for certain rxgk keys, which only need to exist in KeyFileExt and not in any other database (like a krb5 KDC), and so aren't derived from a krb5 keytab. Change-Id: I1f3b27e074b0931deb8645f7550e0b315d82e249 Reviewed-on: https://gerrit.openafs.org/12768 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 5120409cc998284f2fb0467c2f88030976140341 Author: Andrew Deason Date: Thu Nov 9 12:47:57 2017 -0600 asetkey: Add new 'delete' command variants The current 'delete' command from asetkey only lets the user delete old-style rxkad keys. Add a couple of new variants to allow specifying the key type and subtype, so the user can delete specific key types and enctypes if they want. 
Change-Id: If0dfaa70ea0b749dadd52a6b7d62fd3ad2b61d18 Reviewed-on: https://gerrit.openafs.org/12767 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 12b46b6af778625a9c360dca61a59fcf30b76fd1 Author: Andrew Deason Date: Fri Sep 28 14:55:56 2018 -0500 afs: Raise osidnlc NCSIZE The current size of the osi DNLC is very small; only 300 entries. Raise it to 4096 entries, to give it some chance of actually helping. In the future, of course, this should be runtime configurable, and we should also raise the hash table size. For now, just raise the number of entries without changing anything else, to try to make sure nothing breaks. With the hash size of 256, this means our hash chains will be at least 16 items long. However, traversing even hundreds of hash items should still be better than frequently hitting the disk cache to find entries, and acquiring more locks, etc. Change-Id: I48f496e8c25fa869ded83e97ff686ed028c923c5 Reviewed-on: https://gerrit.openafs.org/13531 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit e02ae66c7eef1bfc5df9c3e9f2acde3bc3102390 Author: Andrew Deason Date: Mon Apr 1 12:57:42 2019 -0400 doc: Remove one lingering reference to src/mcas Change-Id: I8b137d28d33a805c4aa941cc64a89d6a504fabc6 Reviewed-on: https://gerrit.openafs.org/13539 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 5d0acbbbc0a7bb250886b3040d9e4de05d4fd27f Author: Benjamin Kaduk Date: Tue Aug 1 20:57:52 2017 -0500 Remove src/mcas This lock-free library toolkit is intriguing and may be the subject of future work, but currently nothing uses this code, and these files are just clutter. Remove src/mcas and stop mentioning it in SOURCE-MAP; don't reference it in the rpctests, either.
Reviewed-on: https://gerrit.openafs.org/12682 Tested-by: Benjamin Kaduk Reviewed-by: Mark Vitale Reviewed-by: Stephan Wiesand Reviewed-by: Benjamin Kaduk (cherry picked from commit bfc5d1ada2f5ce12bfafe65d352982adbefe9911) Change-Id: I98bec6f0a91e4aad05846a6791719cac63050f02 Reviewed-on: https://gerrit.openafs.org/13538 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 3d22ce36dcb86df564d4d91ff0e174792b30d68f Author: Pat Riehecky Date: Wed Jun 6 10:01:02 2018 -0500 afsmonitor: avoid double free on exit The afsmonitor may leak memory and do a double free on shutdown when it was started with a non-zero -buffers parameter value. The deallocation of the cm results circular buffer incorrectly frees the base of the array of results instead of each result. The fs buffer clean up got this right. This fixes the clang scan-build warning: afsmonitor.c:461:7: warning: Attempt to free released memory free(tmp_cmlist); ^~~~~~~~~~~~~~~~ [mmeffie: update code and commit message] Change-Id: Ifd4ea5b9b865f04e5cf88560dd8a9dfdbe7e32cb Reviewed-on: https://gerrit.openafs.org/13161 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 1b835d1c1a5d4a838ab1344abc6615626a28b715 Author: Andrew Deason Date: Thu Nov 9 00:03:04 2017 -0600 asetkey: Allow rxgk keys Add rxgk support to asetkey. This just allows asetkey to display rxgk keys more prettily, and allows the user to add literal rxgk key data on the command line, or add keytab-derived keys. Change-Id: Ic28fea628614be2b20276631bc7e7c2f85ccc154 Reviewed-on: https://gerrit.openafs.org/12766 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 5505ccbaf74f7d36cea180a65001d31bbc0abea0 Author: Benjamin Kaduk Date: Sun Apr 13 21:38:02 2014 -0400 auth: Add afsconf_ClientAuthRXGK variants Add various afsconf_ClientAuthRXGK* variants, to use local printed rxgk tokens with clear, auth, or crypt levels. 
Also add the flag AFSCONF_SECOPTS_RXGK for afsconf_PickClientSecObj, to let callers of afsconf_PickClientSecObj use rxgk connections. To allow selecting of the "clear" level, add the flag AFSCONF_SECOPTS_ALWAYSCLEAR. And to allow selecting the "auth" level but letting "crypt" be the default for rxgk, add the new flag AFSCONF_SECOPTS_NEVERENCRYPT. Change-Id: Ib27f2799eb927ac5aa71eab94212171344dd93df Reviewed-on: https://gerrit.openafs.org/11104 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 0b3bd1b7cdc88ba62c8cd540e8628faa84e33cf9 Author: Andrew Deason Date: Thu Jan 17 00:04:36 2019 -0600 dir: Honor non-ENOENT lookup errors Currently, several places in src/dir/dir.c assume that any error from a lower-level function (e.g. FindItem) means that the item we're looking for does not exist in that directory. But if we encountered some other error, that may not be the case; the directory blob may be corrupt, we may have encountered some I/O error, etc. To detect cases like this, return the actual error code from FindItem &c, instead of always reporting ENOENT. For the code paths that are actually specifically looking for if the target exists (in afs_dir_Create), change our checks to specifically check for ENOENT, and return any other error. Do the same thing for a few similar callers in viced/afsfileprocs.c, as well. FIXES 134904 Change-Id: I41073464b9ef20e4cbb45bcc61a43f70380eb930 Reviewed-on: https://gerrit.openafs.org/13431 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 8b6ae2893b517bd4e008cae94acff70abe4d2227 Author: Andrew Deason Date: Thu Mar 21 15:24:06 2019 -0500 LINUX: Avoid lookup ENOENT on fatal signals Various Linux kernel operations on various Linux kernel versions can fail if the current process has a pending fatal signal (i.e. SIGKILL), including reads and writes to our local disk cache. 
Depending on what and when something fails because of this, some parts of libafs throw an ENOENT error, which may propagate up to callers, and be returned from afs_lookup(). Notably this can happen via some functions in src/dir/dir.c, and previously was possible with some code paths before they were fixed by commit 2aa4cb04 (afs: Stop abusing ENOENT). For the most part, the exact error given to the userspace caller doesn't matter, since the process will die as soon as we return to userspace. However, for ENOENT errors specifically for lookups, we interpret this to mean that the target filename is known to not exist, and so we create a negative dentry for that name, which is cached. Future lookups for that filename will then result in ENOENT before any AFS functions are called. The lingering abuses of the ENOENT error code should be removed from libafs entirely, but as an extra layer of safety, we can just avoid returning ENOENT from lookups if the current process has a pending fatal signal. So to do that, change all afs_lookup() callers in src/afs/LINUX to translate ENOENT to EINTR if we have a pending fatal signal. If fatal_signal_pending() is not available, then we don't do this translation. FIXES 134904 Change-Id: I00f1516c2aa0f45f1129f5d5a44150b7539c31cc Reviewed-on: https://gerrit.openafs.org/13530 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit b9f0b63792270383b23c6a6462cd5f4590db1975 Author: Andrew Deason Date: Sun Mar 4 17:33:47 2018 -0600 Use rxgk in afsconf_BuildServerSecurityObjects In afsconf_BuildServerSecurityObjects, create a server security object for rxgk. Currently, this will only accept printed rxgk tokens, not tokens negotiated via GSSNegotiate. Future commits will add functionality to handle user-negotiated tokens, fileserver-specific creds, etc. 
Change-Id: Ie2bbef0d591641e80bb85240316c4ee5f9f8ff05 Reviewed-on: https://gerrit.openafs.org/12941 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 83eec9093c8a3f177268a9164182e8ba3958dbc8 Author: Benjamin Kaduk Date: Wed Mar 26 06:24:02 2014 -0400 Implement afsconf_GetRXGKKey Also afsconf_GetLatestRXGKKey, as a side effect, since we want to have a single getkey function both for getting encrypting and decrypting keys; a kvno/enctype pair of 0/0 indicates that the "get latest" behavior is desired. Implement both functions in terms of an internal helper that takes as an argument the type of key to look for in the KeyFileExt. We can reuse these helpers wholesale for per-fileserver keys, later. This also requires implementing an ordering on the quality of the different RFC 3961 enctypes (which are stored as the subtype of keys of type afsconf_rxgk). This is subject to debate on the actual ordering, but since the IANA enctype registry changes rarely, just assign a full ordering on the standardized (symmetric!) enctypes. Implement this via a new function, rxgk_enctype_better, in rxgk_crypto_rfc3961.c. Introduce a new header file, rxgk_types.h, so we can avoid including the entire rxgk.h header in cellconfig.p.h. Change-Id: I81389b21238fd6588cc4381b026816005f81a30c Reviewed-on: https://gerrit.openafs.org/11099 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 4091b9271b1bfbf27f9d6871aa884df81220861a Author: Ben Kaduk Date: Wed Dec 4 13:03:46 2013 -0500 Add rxgk support to userok Change-Id: I5da2a89532453b6bec61fc87218a61455e39f6f0 Reviewed-on: https://gerrit.openafs.org/10576 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 69e083d4aaf8731049cbedf85ee5ade31277f251 Author: Ben Kaduk Date: Fri Dec 13 18:46:11 2013 -0500 Build rxgk support into libafsrpc Add a dependency on the appropriate $(GSSAPI_LIBS) and link in the librxgk_pic.la helper. 
Careful control of what functions are exposed allows static linking to continue to work when rxgk is disabled, though a stub is needed for the case of rxgk_GetServerInfo, so that there is a symbol present to satisfy the export symbol list. Consumers of libafsrpc.a need not be modified in accordance with this change. Change-Id: I76c0329ba842fb0d4d66534810b114a0813c90a0 Reviewed-on: https://gerrit.openafs.org/10591 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 20b0f5b4d0b55e79e55442978c297663a5e18b76 Author: Benjamin Kaduk Date: Fri Sep 1 17:45:10 2017 -0500 Add rxgk_GetServerInfo stub Provide a stub function that libafsrpc can export when rxgk support is disabled. (It always returns failure, of course.) Change-Id: Id9f816d25c1a8f56995ec185ae83db0924de0010 Reviewed-on: https://gerrit.openafs.org/12721 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit ce38ed952962b4bbba80a4d3bff1ee1ac01ca4e4 Author: Andrew Deason Date: Fri Mar 2 00:24:54 2018 -0600 rxdebug: Add rxgk support Change-Id: I6ffeb7b36f41816ca1c3d12bb5e8097dd5d7a3fd Reviewed-on: https://gerrit.openafs.org/12940 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 67da564a5b0acd01fe67829fe28ea808e0d278a4 Author: Ben Kaduk Date: Tue Dec 10 00:09:35 2013 -0500 Implement rxgk client security object routines Change-Id: Ic7e11b02cb1573cfdb6d11d4de9a77ab1c563262 Reviewed-on: https://gerrit.openafs.org/10573 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit cda288a2e4ebbd3c915f946a50fa2b59d7ee12b4 Author: Ben Kaduk Date: Mon Dec 9 22:13:16 2013 -0500 Implement the rxgk server security object routines Provide non-trivial implementations of the security class routines used by the server, along with helpers as necessary. 
The identity supplied in a client's token is given as a list of PrAuthNames; we assume that at most one name is supplied at present, as the meaning of compound identities (and the use of compound identities for keyed cache managers) is not fully specified yet. Convert the PrAuthName to an rx_identity for caching in the server connection state, as the rx_identity type is more compatible with superuser checks on the connection. Also provide an rxgk_GetServerInfo routine which extracts the cached identity, for use in libauth when making superuser checks. This moves our dependency on rx_identity from the private data structures into the public header, so move the nested include accordingly. Change-Id: I0f48b69d4ab758d8a4d76ebfb1daf3009c4fe060 Reviewed-on: https://gerrit.openafs.org/10572 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit ae9b90170ffa02f7b65339b3c138709362f27d69 Author: Andrew Deason Date: Tue Mar 12 17:03:09 2019 -0500 rxgk: Avoid calling xdr_destroy on blank xdrs A couple of callers in rxgk_token.c call xdr_destroy(&xdrs) in a cleanup code path; at present the code is fine because we are careful to only jump to the cleanup path from a state where the xdrs are initialized, but this is needlessly fragile (and is an undocumented requirement of the code). Since xdr_destroy() unconditionally looks at xdrs.x_ops->x_destroy, this could cause a NULL dereference if an error is encountered in a future version where the 'xdrs' may be zeroed when the cleanup path runs. Change-Id: I23c1bd09c88238bc602cc92572df4cd2278c69c9 Reviewed-on: https://gerrit.openafs.org/13521 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit aa6661f653d86d4e792587eefbc37705b68e5137 Author: Andrew Deason Date: Tue Mar 12 18:42:42 2019 -0500 rxgk: Do not require gss_pseudo_random We actually do not yet call gss_pseudo_random anywhere in the rxgk codebase. 
We will need this later, so print a warning when we don't have it, but let rxgk build so we can build on platforms without gss_pseudo_random for now (Solaris/SEAM). Change-Id: I1cee935a12caad1ac00717f468d7e6661e0817c9 Reviewed-on: https://gerrit.openafs.org/13520 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit de883869d7ac2af6a640f8cf9f3d8c7c37433ce5 Author: Andrew Deason Date: Fri Feb 1 23:25:02 2019 -0600 auth: Make afsconf_PutTypedKeyList idempotent Currently, if we call afsconf_PutTypedKeyList on a key list, we set the key list to NULL. But then if we call afsconf_PutTypedKeyList on a NULL key list, we segfault because we try to dereference the list. Change afsconf_PutTypedKeyList to be a noop if we give it a NULL list, avoiding a segfault in such a situation. Change-Id: I2c1de0c0a05ab036667031eb0e765933917826a6 Reviewed-on: https://gerrit.openafs.org/13507 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 635594d6cceba6de4e09be5a9e9b908f7d16697d Author: Andrew Deason Date: Wed Mar 13 18:30:43 2019 -0500 rx: Do not ignore RXS_* op errors Several places in rx call an RXS_* security layer operation, but ignore the error code. Though errors for these operations are rare or impossible currently, if they ever do return an error there could be noticeable consequences, like a connection getting an uninitialized challenge nonce, or sending a challenge packet with uninitialized payload. Change these call sites to record and handle the error. Errors from the security class normally mean aborting the entire conn, but for many operations we need to behave differently: - For RXS_DestroyConnection, errors don't make sense, since we're just freeing an object. Change the op to return void, and update our implementations of DestroyConnection to match. - For RXS_GetStats, just clear the relevant stats structure on error instead. 
This change also results in us clearing the stats structure when there is no security class associated with the connection; previously we just reused the same struct data as the previous conn. - For RXS_CreateChallenge, aborting the entire conn is difficult, because some code paths have callers that potentially lock multiple calls on the same conn (rxi_UpdatePeerReach -> TryAttach -> rxi_ChallengeOn -> RXS_CreateChallenge), and aborting our conn requires locking every call on the conn. So instead we just propagate an error up to our callers, and we abort just the call we have. - For RXS_GetChallenge, we cannot abort the conn when rxi_ChallengeEvent is called directly, because the caller will have the call locked. But when rxi_ChallengeEvent is called as an event (when we retry sending the challenge), we can. - For RXS_SetConfiguration, propagate the error up to our caller. Update all rx_SetSecurityConfiguration callers to record and handle the error; all of these are during initialization of daemons, so have them log an error and exit. Change-Id: I138b3e06da00470c7d70c458879cc741d296d225 Reviewed-on: https://gerrit.openafs.org/13522 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 2ee35afa339731f6a60f1e5e99ccaf63baa6c891 Author: Stephan Wiesand Date: Fri Mar 22 12:46:17 2019 +0100 Add param.h files and sysnames for FreeBSD 11.2 Thanks to Måns Nilsson for filing the bug. Note that this change differs from the proposed patch in the report, in that it doesn't define the 10.4 symbols in the 11.2 param.h files. FIXES 134850 Change-Id: I83b3a81609c109eef243533b0e1defa3aca0d526 Reviewed-on: https://gerrit.openafs.org/13534 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Stephan Wiesand commit e7ea4781f07b29f7f0fc0b5ba17303bd68022e54 Author: Karl Behler Date: Fri Mar 22 12:22:05 2019 +0100 man-pages: create the man3 subdirectory in prep-noistall This should fix a build failure reported on the openafs-devel list today. 
Change-Id: I227922f78aaa614b73dd1f5c1c61116168fc0b69 Reviewed-on: https://gerrit.openafs.org/13533 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 11cc0a3c4e0d76f1650596bd1568f01367ab5be2 Author: Andrew Deason Date: Sat Mar 2 15:58:00 2019 -0600 afs: Cleanup state on rxfs_*Init errors Currently, rxfs_storeInit and rxfs_fetchInit return early if they encounter an error while starting the relevant fetch/store RPC (e.g. StartRXAFS_FetchData64). In this scenario, they osi_FreeSmallSpace their rock before returning, but they never go through their destructor to free the contents of the rock (rxfs_storeDestroy/rxfs_fetchDestroy), leaking any resources inside that have already been initialized. The only thing that could have been initialized by this point is v->call, so hitting this condition means we leak an Rx call, and means we can report the wrong error code (since we never go through rx_EndCall, we never look at the call's abort code). For rxfs_fetchInit, most code paths call rx_EndCall explicitly, except for the code path where StartRXAFS_FetchData64 itself fails. For both fetches and stores, it's difficult to hit this condition, because this requires that the StartRXAFS_* call fails, before we have sent or received any data from the wire. However, this can be hit if the call is already aborted before we use it, which can happen if the underlying connection has already been aborted by a connection abort. Before commit 0835d7c2 ("afs: make sure to call afs_Analyze after afs_Conn"), this was most easily hit by trying to fetch data with a bad security object (for example, with expired credentials). After the first fetch failed due to a connection abort (e.g. RXKADEXPIRED), afs_GetDCache would retry the fetch with the same connection, and StartRXAFS_FetchData64 would fail because the connection and call were already aborted. In this case, we'd leak the Rx call, and we would throw an RXGEN_CC_MARSHAL error (-450), instead of the correct RXKADEXPIRED error. 
This causes libafs to report the target server as unreachable, due to the negative error code. With commit 0835d7c2, this doesn't happen because we call afs_Analyze before retrying the fetch, which detects the invalid credentials and forces creating a new connection object. However, this situation should still be possible if a different call on the same connection triggered a connection-level abort before we called StartRXAFS_FetchData64. To fix this and ensure that we don't leak Rx calls, explicitly call rxfs_storeDestroy/rxfs_fetchDestroy in this error case, before returning from rxfs_storeInit/rxfs_fetchInit. Thanks to yadayada@in.ibm.com for reporting a related issue and providing analysis. Change-Id: I15e02f8c9e620c5861e3dcb03c42510528ce9a60 Reviewed-on: https://gerrit.openafs.org/13510 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 6e5638ac7297701a99ea396dee1df8f56a6a50da Author: Andrew Deason Date: Mon Feb 25 11:35:24 2019 -0600 Remove references to SunOS 4 We already removed support for Solaris versions before Solaris 8, in commit e4c2810f ("Remove support for Solaris pre-8"), but there are still some references to SunOS (meaning SunOS 4) in the tree. This is even older than Solaris (aka SunOS 5), so get rid of these. This commit removes most references to SunOS 4 regarding platform support, and a few comments. This also removes a few comments that were just wrong or nonsensical (e.g. CMAPPED in afs.h is used by other platforms; some comments in platform-specific osi_file.c files referenced SunOS for some reason).
Change-Id: I0dd3176c582409176fd898f9c9539fbd833ea789 Reviewed-on: https://gerrit.openafs.org/13506 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 872902dcf99186864cfcaf01ab945123f2506c6c Author: Andrew Deason Date: Wed Mar 6 23:06:16 2019 -0600 rx: Make rxi_Free(NULL, size) a no-op Commit 75233973 (afs: Make afs_osi_Free(NULL) a no-op) intended to make some of our free abstractions behave like the userspace free, so freeing NULL is a no-op. However, that commit still left rxi_Free such that rxi_Free(NULL, size) would decrement the relevant allocation counters. So to make our free abstractions more consistent, just skip all of rxi_Free when the given pointer is NULL. Change-Id: I89047e1846eb3e2932d2a125676fb7ffec8972dc Reviewed-on: https://gerrit.openafs.org/13514 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit df23589d2cc0419d8e74b5f1b824512d95623d2e Author: Ben Kaduk Date: Tue Dec 10 17:47:42 2013 -0500 Add rxgk_util.c A few helper routines for the security class implementation. Change-Id: I395802b6c3b2436df4b00906544fc797f3e12e9b Reviewed-on: https://gerrit.openafs.org/10937 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 23c4c3bc0cea68b8d05517065daea849fadad609 Author: Ben Kaduk Date: Mon Dec 9 23:07:17 2013 -0500 Add rxgk_packet.c Routines to apply and verify encryption and MICs to the data in rx packets. Backend to the rxgk_crypto framework for the actual crypto operations. Change-Id: I724efacf7df1d688c0d61a327fa9ee9c8168d715 Reviewed-on: https://gerrit.openafs.org/10571 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit aa231f105ea92275672941cbc2178d9ca26261e0 Author: Mark Vitale Date: Mon Feb 11 18:08:42 2019 -0500 rxgk: fix typo in make dest rule make dest should create directories in DEST, not DESTDIR. Fix the rule. 
Change-Id: I355e35cc6902517956935d3d2970836494490e69 Reviewed-on: https://gerrit.openafs.org/13489 Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 6e988a5b3900fe73c314c9960d6fb7753ff98411 Author: Cheyenne Wills Date: Fri Mar 1 08:46:32 2019 -0700 bos: remove smail-notifier smail-notifier is a sample program that is undocumented and has not been well maintained. It produces copious compiler warnings, and would require effort to bring the code up to decent coding practices. The bosserver provides a -notifier feature that can be used for notifications, but that feature does not depend on this sample program. Removed the code, cleaned up the Makefiles and .gitignore. Change-Id: I6bd56559121d12ad007acc571b6653aa934eb97f Reviewed-on: https://gerrit.openafs.org/13509 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit df8534909fdc1fa8417aa788c0fa71c5dbe7eb30 Author: Benjamin Kaduk Date: Sat Feb 2 17:02:08 2019 -0600 scout: band-aid -Wformat-truncation gcc8 gets pretty confused about the bounds on these things (presumably due to our alignment options) and thinks this could potentially be a huge string. Check for truncation to appease the compiler, instead of trying to ensure that the buffer is big enough. Change-Id: I4c1e0e6a5a38ee67845cbb7791b280b965989bc8 Reviewed-on: https://gerrit.openafs.org/13470 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Tested-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 8632f23d6718a3cd621791e82d1cf6ead8690978 Author: Benjamin Kaduk Date: Sat Feb 2 12:49:07 2019 -0600 vol: check snprintf return values in namei_ops gcc8 is more aggressive about parsing format strings and computing bounds on the generated text from functions like snprintf. In this case it seems best to detect cases of truncation and error out, rather than trying to increase stack buffer sizes or switch to asprintf. 
These paths should be well-behaved since they are local to the fileserver, so this is mostly about appeasing the compiler's -Wformat-truncation checks to allow us to build with --enable-checking. Change-Id: Id3f15e450c0f03143c0cc7e40186d5944a8fa3b4 Reviewed-on: https://gerrit.openafs.org/13463 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 453060c27a5d33d3c27128d169298f9d66d06f1a Author: Benjamin Kaduk Date: Sat Feb 2 19:52:26 2019 -0600 libadmin: appease clang -Wsometimes-uninitialized clang thinks that 'time' can be used uninitialized: bos.c:1472:9: error: variable 'time' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (as->parms[TIME].items) { ^~~~~~~~~~~~~~~~~~~~~ bos.c:1478:57: note: uninitialized use occurs here if (!bos_ExecutableRestartTimeSet(bos_server, type, time, &st)) { ^~~~ bos.c:1472:5: note: remove the 'if' if its condition is always true if (as->parms[TIME].items) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~ bos.c:1445:5: note: variable 'time' is declared here bos_RestartTime_t time; ^ but in this command description, the TIME argument is required. Add a never-triggered error exit to appease the compiler when --enable-checking is activated. Change-Id: I38fac64fc5aba071f84f2f9e1b497df22df76f09 Reviewed-on: https://gerrit.openafs.org/13476 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 7c15e6efe62fb3fe1970c56331df09b257abf6d9 Author: Benjamin Kaduk Date: Sat Feb 2 19:48:20 2019 -0600 uss: signed/unsigned char fallout When char is signed, assigning 255 to a variable of type char changes the value, which causes clang to emit a warning and fail the --enable-checking build. 
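As an illustration of the signed/unsigned char issue described in the uss commit above: whether a plain char can hold 255 is implementation-defined, while an unsigned char always can. The sketch below is not the uss code; MARKER, make_marker, and plain_char_can_hold_marker are hypothetical names, and using unsigned char is only one common way to avoid the warning (the commit message does not say which fix was applied).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical marker value that needs all eight low bits set. */
#define MARKER 255

/*
 * Store the marker in an unsigned char. UCHAR_MAX is at least 255 on every
 * conforming implementation, so the value survives the assignment unchanged.
 */
unsigned char make_marker(void)
{
    unsigned char m = MARKER;
    assert(m == MARKER);
    return m;
}

/*
 * Report whether a plain char could hold the marker without a value change.
 * This is true only where plain char is unsigned (CHAR_MIN == 0).
 */
int plain_char_can_hold_marker(void)
{
    return CHAR_MAX >= MARKER;
}
```

On platforms where plain char is signed, plain_char_can_hold_marker() returns 0, which is the situation clang warns about.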
Change-Id: Id02e2526a9a9dd6657dee55b9dc22da03d102d8c Reviewed-on: https://gerrit.openafs.org/13475 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit f0a3d477d6109697645cfdcc17617b502349d91b Author: Benjamin Kaduk Date: Sat Feb 2 19:45:31 2019 -0600 rework afs_random() yet again clang 7 notes that ~0 is signed and that left-shifting into the sign bit is undefined behavior. Use a new construction to clear the low byte of tv_usec with only bitwise operations that are independent of the width of tv_usec and stay within the realm of C's defined behavior. Change-Id: I3e4f0fa4a8b8b72df23ef0c8ad7c4a229ac942f3 Reviewed-on: https://gerrit.openafs.org/13474 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 96c0b88947c7aab605170bdca633d3716051a58e Author: Benjamin Kaduk Date: Sat Feb 2 18:39:53 2019 -0600 Avoid incomplete function type in casts clang complains that these casts contain an incomplete function type (since the function argument is omitted rather than declared to be void). Since we just need the cast to pointer type, let the compiler do it implicitly and pass stock NULL, rather than trying to force a cast to function-pointer type. Change-Id: I7f19f2936fe5425573c68fdd727ea90de02defd7 Reviewed-on: https://gerrit.openafs.org/13473 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 8f03ff3bdd8eb9f4557cdb7054aee9b8ea432160 Author: Benjamin Kaduk Date: Sat Feb 2 17:10:29 2019 -0600 dumpscan: appease gcc8 -Wformat-overflow gcc does not benefit from our external knowledge that tm_year is tightly bounded, and thinks it could still be in the range [-2147481748, 2147483647], which would overflow our string buffer. The function in question does not have error handling in place, so rather than adding some or trying to assert the proper bounds, just use a slightly larger buffer for safety.
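The "slightly larger buffer" approach from the dumpscan commit above can be sketched as follows. This is not the dumpscan code: YEAR_BUF_SIZE and format_year are hypothetical names, the size assumes a 32-bit int, and the addition is done in long long so the sketch itself stays within defined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Room for the widest possible year string plus the trailing NUL, assuming
 * a 32-bit int: "-2147481748" is 11 characters.
 */
#define YEAR_BUF_SIZE 12

/*
 * Format tm_year (years since 1900) as a calendar year into buf.
 * Returns 0 on success, -1 if the text did not fit in buflen bytes.
 */
int format_year(int tm_year, char *buf, size_t buflen)
{
    long long year;
    int n;

    assert(buf != NULL && buflen > 0);

    year = (long long)tm_year + 1900;   /* widen first to avoid int overflow */
    n = snprintf(buf, buflen, "%lld", year);
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

With a buffer of YEAR_BUF_SIZE bytes, every int value of tm_year fits under the stated assumption, so the compiler can prove the bound without knowing that tm_year is small in practice.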
Change-Id: Iafcba5588b805347ddcc0102969bd0e2a3173dd0 Reviewed-on: https://gerrit.openafs.org/13472 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit dff81f1b78fecc54f5af91f7d728925ffca62d2c Author: Benjamin Kaduk Date: Sat Feb 2 17:09:36 2019 -0600 venus: appease gcc8's -Wformat-string Interestingly, even before this commit, the buffer size was larger than what the kernel would accept. Since the kernel does its own length checking, it's simplest to just allow slightly larger requests here and have them fail later. Change-Id: I9ed636e4ad025240cb27b3cc066a8f2a72959396 Reviewed-on: https://gerrit.openafs.org/13471 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit a89297a066d8689f8fc29a7428cfe3ed6235d010 Author: Benjamin Kaduk Date: Sat Feb 2 15:44:54 2019 -0600 butc: -Wformat-truncation fallout Increase some buffer sizes to appease gcc8. While here, use snprintf instead of plain sprintf(!). Change-Id: I39d29522b92070ce2845ba3d392aaf2d97fc7b6e Reviewed-on: https://gerrit.openafs.org/13468 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 584b0f2b6b4391c0c879352bb1786c0f267666c9 Author: Benjamin Kaduk Date: Sat Feb 2 14:43:04 2019 -0600 vlserver: use large enough buffer for rxinfo string The "[dotted-quad] rxkad:name.inst@cell" construct can be as large as (3*4+3)+7+3*64+2+1 == 217 characters (including trailing NUL); size our buffer accordingly to avoid the risk of truncation. 
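The buffer-size arithmetic in the vlserver commit above can be written out as named constants. This is only a sketch: DOTTED_QUAD_MAX, FIELD_MAX, RXINFO_BUF_SIZE, and rxinfo_buf_size are hypothetical names, and the commit message does not spell out what the individual 7 and 2 terms cover, so they are kept as the literal values it gives.

```c
#include <assert.h>

enum {
    /* Dotted quad: four groups of up to 3 digits, plus 3 dots. */
    DOTTED_QUAD_MAX = 3 * 4 + 3,
    /* Upper bound used for each of the name, inst, and cell fields. */
    FIELD_MAX = 64,
    /*
     * Terms in the same order as the commit message:
     * (3*4+3) + 7 + 3*64 + 2 + 1, the final 1 being the trailing NUL.
     */
    RXINFO_BUF_SIZE = DOTTED_QUAD_MAX + 7 + 3 * FIELD_MAX + 2 + 1
};

/* Return the computed size, checking it against the figure in the commit. */
int rxinfo_buf_size(void)
{
    assert(RXINFO_BUF_SIZE == 217);
    return RXINFO_BUF_SIZE;
}
```

Deriving the size from its parts keeps the total in step if any one bound changes.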
Change-Id: Iee635aa66f5f639dfb0572c559a87b5313c305a9 Reviewed-on: https://gerrit.openafs.org/13466 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 7620bd33487207b348ed7aeba45f8d743132ba84 Author: Benjamin Kaduk Date: Sat Feb 2 14:23:03 2019 -0600 vlserver: fix vlentryread() for old vldb formats When we're using old format compatibility, use OMAXNSERVERS for the array lengths instead of MAXNSERVERS. Otherwise we'll try to copy more data than we've read. Detected by gcc8 as: vlutils.c:183:2: error: ‘memcpy’ forming offset [149, 151] is out of the bounds [0, 148] of object ‘tentry’ with type ‘struct vlentry’ [-Werror=array-bounds] memcpy(nbufp->serverFlags, oep->serverFlags, NMAXNSERVERS); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ vlutils.c:141:26: note: ‘tentry’ declared here struct vlentry *oep, tentry; ^~~~~~ Change-Id: Ie720ca037c5a8bd6aaff5b6d5348161e0175b23b Reviewed-on: https://gerrit.openafs.org/13465 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit d6b88e3bd5219a8dffebc07df23e30f1d16f095f Author: Benjamin Kaduk Date: Sat Feb 2 12:56:26 2019 -0600 vol: avoid -Wformat-truncation issues in vol-salvage.c Make some formerly-64-character buffers VMAXPATHLEN (plus a smidgeon) to give them space to hold the composed paths. Change-Id: I403c822a8b7376d08fb29f0127315ec439a5cf0d Reviewed-on: https://gerrit.openafs.org/13464 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 9a5ba85d1853327d8184287e58a6e03fabaaf23d Author: Benjamin Kaduk Date: Sat Feb 2 15:26:23 2019 -0600 uss: Allocate buffer space for trailing NUL Appease gcc8's -Wformat-truncation engine. 
Change-Id: I2113770f63357edf0f5ca273daf0c516a72034a8 Reviewed-on: https://gerrit.openafs.org/13467 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit d1c32aed108b8ac013757be26052a82aa96bb52f Author: Ben Kaduk Date: Mon Dec 9 14:35:52 2013 -0500 Add rxgk_token.c Routines for constructing tokens (both regular and printed), extracting and decrypting tokens, and helpers therein. Provide the ability to print a token using a given session key and using a random session key; the former is useful for certain variants of localauth wherein a dummy GSS negotiation is performed with the same identity acting as initiator and acceptor. Include a paranoid sanity-check that only the routines intended to produce printed tokens can produce tokens with a zero-length identities list. Change-Id: I0cde7fd0cdf9a27777523cd502b21bdccef41dcc Reviewed-on: https://gerrit.openafs.org/10567 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 868e6248401756594f7abf985c2741d80d3a8517 Author: Mark Vitale Date: Mon Feb 11 02:54:31 2019 -0500 ptclient: enable pthreaded support ptclient has been essentially disabled for pthreads since the ibm-1.0 release. Remove the conditionals to make a functional pthreaded ptclient. Change-Id: Ib0f60b3ab395827b73e5646b014e28ab09607e0e Reviewed-on: https://gerrit.openafs.org/13500 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit ce0eb0f8b2936310eb1b55629772750103475d9a Author: Michael Meffie Date: Wed Nov 21 07:39:24 2018 -0500 auth: refactor afsconf_Open Move code to check the AFSCONF environment variable and read the .AFSCONF files to separate functions. Rename the internal functions afsconf_OpenInternal and afsconf_CloseInternal to the more aptly named LoadConfig and UnloadConfig in preparation for other changes. Add doxygen comments for these functions. 
Change-Id: Ie3361036c59c9e6ef99801891fff9fad63840344 Reviewed-on: https://gerrit.openafs.org/13397 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 2aafe7df403e6a848185d15495139c07bced2758 Author: Andrew Deason Date: Wed Aug 9 20:06:05 2017 -0500 SOLARIS: Switch non-embedded vnodes for Solaris 11 Newer updates to Solaris 11 have been including several changes to the vnode struct. Since we embed a vnode in our struct vcache, our kernel module must be recompiled for any such change in order for the openafs client to work at all. To avoid the need for this, switch Solaris to using a non-embedded vnode in our struct vcache. Follow a similar technique as is used in DARWIN and XBSD, where we allocate a vnode in osi_AttachVnode, and free it in afs_FlushVCache. Change-Id: I85fd5d084a13bdea4353b5ad9840fddbc45ce8c0 Reviewed-on: https://gerrit.openafs.org/12696 Reviewed-by: Mark Vitale Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Tested-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit a6499e0b086d964f3fcc65fe4be31edc33015061 Author: Andrew Deason Date: Wed Aug 9 20:06:03 2017 -0500 SOLARIS: Fix vnode/vcache casts A few places were using vnodes and vcaches interchangeably. This is incorrect, since they may not always be the same thing if we stop embedding vnodes directly in vcaches Fix these to properly go through AFSTOV/VTOAFS to convert between vcaches and vnodes. 
Change-Id: I8a2e42d7b83a5374d2b16b19c47417e7f44d4f27 Reviewed-on: https://gerrit.openafs.org/12695 Reviewed-by: Mark Vitale Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk Tested-by: Mark Vitale commit 9a2b11747ce355d9adc8a5a646c88f8f3d9765ee Author: Andrew Deason Date: Wed Aug 9 20:06:00 2017 -0500 SOLARIS: Accept vnodes in vnode ops Currently, our vnode op callbacks look like this: int gafs_fsync(struct vcache *avc, afs_ucred_t *acred); And a pointer to gafs_fsync is given directly to Solaris. This cannot be correct, since 'struct vcache' is an OpenAFS type, so Solaris cannot possibly give us a 'struct vcache'. The actual correct signature for such a function is something like this: int gafs_fsync(struct vnode *vp, afs_ucred_t *acred); And then the 'gafs_fsync' function is supposed to translate 'vp' into a vcache. This works on Solaris right now because we embed the vnode as the first member in our vcache, and so a pointer to a vnode is also a pointer to a vcache. However, this would break if we ever change Solaris vcaches to use a non-embedded vnode (like on some other platforms). And even now, this causes a lot of warnings in osi_vnodeops.c, since the function signatures are wrong for our vnode callbacks. So to fix this, change all of these functions to accept a 'struct vnode', and translate to/from vnodes and vcaches appropriately. Change-Id: Ic1c4bfdb7675037d947273ed987cacd05eddfc92 Reviewed-on: https://gerrit.openafs.org/12694 Reviewed-by: Mark Vitale Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk Tested-by: Mark Vitale commit 41a22dbf719629e0977fa963b3d19c6594d0d729 Author: Andrew Deason Date: Wed Aug 9 20:05:56 2017 -0500 SOLARIS: Reorder definitions for vnode callbacks Currently, many of the functions for our vnode ops are forward-declared, right before they are referenced in the relevant vnop template array. 
Move the function definitions to before the references, so we can simply get rid of the forward declarations. These functions are also all only referenced in this file, so declare them 'static'. Change-Id: Icd82b6d6176342e2576ce333b40c4b79e8c692c1 Reviewed-on: https://gerrit.openafs.org/12693 Reviewed-by: Mark Vitale Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: Mark Vitale commit aa46af6ae35e4f026a8ed94012c3bc18c954de23 Author: Andrew Deason Date: Wed Aug 9 20:05:50 2017 -0500 SOLARIS: Clean up some osi_vnodeops func defs Currently, the Solaris osi_vnodeops.c file forward-declares many of its function definitions, but doesn't declare the arguments. For example: int afs_nfsrdwr(); This avoids type-checking for a few functions that are called before they are defined in this file. Furthermore, many of these functions are only used within this file, but are not declared 'static'. To fix this weirdness, remove most of the forward declarations (most are not referenced until the function is defined), and fully declare the rest. Declare functions 'static' that are not referenced outside of this file. This commit only changes functions up to the 'afs_getsecattr' definition. The rest of the file will be fixed in a future commit. Change-Id: I3f58b9ad8e9c3ea8b3fe3dffacd5118eee0a7ff2 Reviewed-on: https://gerrit.openafs.org/12692 Reviewed-by: Mark Vitale Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: Mark Vitale commit d0a2889098526aa148d99e042aa8c3f7855565f7 Author: Mark Vitale Date: Wed Feb 6 16:55:03 2019 -0500 auth: remove stale "magic number" comment A comment in GenericAuth() refers to a "magic number" which used to be present as: *aindex = 2; Commit d5622d03196762bd8a60404fea98b4bb044e076d made this a proper enum: *aindex = RX_SECIDX_KAD; Update the comment to remove mention of a "magic number". 
No functional change is incurred by this commit. Change-Id: I1d4770211fe4f88822426a9fe19db77bbb0d7738 Reviewed-on: https://gerrit.openafs.org/13490 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 297c479989efb6bd9d4011a43d6c0dc92596761b Author: Pat Riehecky Date: Fri Sep 21 10:05:24 2018 -0500 cmd: bail if out of memory while printing syntax Bail with an error message to stderr if we are unable to format the command syntax due to a string allocation error. Found via scan-build. [mmeffie: updated commit] Change-Id: Ib3bc7f53c295d8dde6c07b9c4990cd1b3bcee58c Reviewed-on: https://gerrit.openafs.org/13335 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 63f015d05293cd853dbd44e5115e6b378644dfb6 Author: Andrew Deason Date: Wed Jan 16 23:44:58 2019 -0600 LINUX: Propagate afs_linux_readdir BlobScan errors In afs_linux_readdir, if we detect an error code from BlobScan, currently we 'break' out of the current while() loop. But right after this loop, we reset 'code' to 0, ignoring the error we just got from BlobScan, and acting like we just reached the end of the directory. This means that if BlobScan could not process the given directory at all, we'll just fail to iterate through some of the entries in the given directory, and not report an error. To fix this, process errors from BlobScan like we do for afs_dir_GetVerifiedBlob, and return an error code and log a message about the corrupted dir. Change-Id: I8bd628624ffc04fc55fd6a0820c73018bd9e4a18 Reviewed-on: https://gerrit.openafs.org/13430 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 2e556c0f23ae439c804352cf51fcf30878b03c7a Author: Andrew Deason Date: Sat Nov 3 01:04:43 2018 -0500 ptserver: Check for -restricted in SPR_Delete Currently, all prdb write operations, except for SPR_Delete, will fail with PRPERM if called by a non-system:administrators caller while restricted mode is active. SPR_Delete is missing this check, and so is not affected by the -restricted option. 
Fix this by inserting the same check for -restricted as all other code paths that check for -restricted. Change-Id: I35f19d0b715423cd91769e6de845efa330368e50 Reviewed-on: https://gerrit.openafs.org/13374 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit bfe912ede6f452d10cfbd5fd549f44ee027acb1b Author: Benjamin Kaduk Date: Sat Feb 2 12:25:35 2019 -0600 vol: fix vutil format-truncation nit We need one more byte for the trailing NUL. Change-Id: I1379e958e3b5ec92802060c4541f419599e49311 Reviewed-on: https://gerrit.openafs.org/13462 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 3a8fa4ecd65d5d743fdc573c9f0f261aee2063b6 Author: Andrew Deason Date: Sat Nov 3 00:58:58 2018 -0500 ptserver: Fix AccessOK -restricted for SYSADMINID According to the documentation, as well as other code paths that check for -restricted, the -restricted option does not affect members of system:administrators. Currently, though, AccessOK only bypasses the -restricted check if the caller is SYSADMINID itself (i.e. localauth). Fix AccessOK to only do the -restricted checks if the caller is not in system:administrators, to match the documentation as well as other ptserver operations. 
Change-Id: I3074d4537845f1f4deb7f4b72cdb819391b617e3 Reviewed-on: https://gerrit.openafs.org/13373 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit dfc78d533ef64c8d6daf134e2a0f67c5c16f7369 Author: Andrew Deason Date: Tue Oct 30 14:29:24 2018 -0500 ptserver: Fix AccessOK -restricted for addToGroup The function AccessOK is used by all of ptserver RPC handlers that need to do an authorization check, and the last two arguments are set as such: - When adding a member to a group, 'mem' is PRP_ADD_MEM and 'any' is PRP_ADD_ANY - When removing a member from a group, 'mem' is PRP_REMOVE_MEM and 'any' is 0 - When modifying an entry (setFieldsEntry) or modifying some global database fields, 'mem' and 'any' are both set to 0 - When reading an entry and not modifying it, 'mem' and/or 'any' are set to other values (depending on if we're checking membership, examining the entry itself, etc) Commit 93ece98c (ptserver-restricted-mode-20050415) added a check to AccessOK to make it return false for -restricted mode when we are adding a member to a group, or when 'mem' and 'any' are both 0. This didn't catch the case when we are removing a member from a group, though, when 'mem' is PRP_REMOVE_MEM. It looks like commit a614a8d9 (ptutils-restricted-accessok-20081025) tried to fix this by adding a check for PRP_REMOVE_MEM, but it also required 'any' to be set to 0 for the conditional to succeed. This is true when removing a member from a group, but when adding a member to a group, 'any' is PRP_ADD_ANY, and so this check fails. This means that currently, when restricted mode is turned on, non-admins can still run addToGroup and setFieldsEntry successfully. Fix this by checking for PRP_ADD_MEM/PRP_REMOVE_MEM separately from checking if 'mem'/'any' are set to 0. Break up this conditional into separate if() statements with comments to try to make the checks more clear. 
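The shape of the separated checks might look roughly like the following. This is a sketch under stated assumptions, not the ptserver code: the function name, the SKETCH_* flag values, and the boolean parameters are invented for illustration, and the real AccessOK takes different arguments.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for PRP_ADD_MEM, PRP_ADD_ANY and PRP_REMOVE_MEM;
 * the numeric values here are hypothetical. */
enum {
    SKETCH_ADD_MEM    = 0x1,
    SKETCH_ADD_ANY    = 0x2,
    SKETCH_REMOVE_MEM = 0x4,
    SKETCH_READ_MEM   = 0x8   /* some read-only check, also hypothetical */
};

/* Returns true when restricted mode should refuse the request. */
static bool
restricted_denies(bool restricted, bool caller_is_admin, int mem, int any)
{
    /* Restricted mode does not apply to system:administrators. */
    if (!restricted || caller_is_admin)
        return false;

    /* Adding or removing a group member, regardless of 'any'. */
    if (mem == SKETCH_ADD_MEM || mem == SKETCH_REMOVE_MEM)
        return true;

    /* Modifying an entry or global database fields. */
    if (mem == 0 && any == 0)
        return true;

    /* Everything else is a read and is left to the normal checks. */
    return false;
}
```

Keeping the membership test independent of 'any' is the point: an add request arrives with a nonzero 'any', so a combined condition that also requires 'any' to be 0 can never match it.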
Change-Id: I7e647865b772c42e70014f48ce9cd53ef511cd5b Reviewed-on: https://gerrit.openafs.org/13370 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit 10f2c469f45eece0e12573388ae66e392e2dff1c Author: Cheyenne Wills Date: Fri Jan 25 17:35:51 2019 -0700 Redhat: 'clean build area' error message during dkms build/install dkms invokes a make clean command before and after building the kernel module. The make clean that is issued at the start of building results in a nuisance error message because the Makefile doesn't yet exist Building module: cleaning build area...(bad exit status: 2) In the dkms.conf file, built from within the openafs.spec, change the command defined in the CLEAN statement to test for the existence of the Makefile prior to running the actual make clean Change-Id: Ifc0d5eed6ef0cbc3ddfd193d27bbcb8a7cf52f2a Reviewed-on: https://gerrit.openafs.org/13460 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 26b1dc036719a588a5cadecb14053bd4079c1f48 Author: Andrew Deason Date: Fri Feb 1 16:31:50 2019 -0600 Avoid calling krb5_free_context(NULL) Several places in the code currently call krb5_free_context(ctx) in a cleanup code path, where 'ctx' may or may not be NULL. This is not guaranteed to be okay, so check for NULL to make sure we don't cause issues in these code paths. While we are here cleaning up krb5_free_context() calls, also fix a few call sites in afscp_util.c that were not calling krb5_free_context in all error paths. Change-Id: I881f01bdf94f00079f84c4bd4bcfa58998e51ac9 Reviewed-on: https://gerrit.openafs.org/13461 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 86d04ea70fd2e99606b1d1b5b68d980d92e7a3cd Author: Andrew Deason Date: Wed Jan 16 23:46:34 2019 -0600 afs: Throw EIO in DRead on empty dir blob DRead currently returns ENOENT if we try to read a page beyond the end of the given dir blob. 
We do this to indicate we've hit EOF, but we do this even if the dir blob is completely empty (which is not a valid dir blob). If a dir blob in the cache is truncated due to cache corruption issues, that means we'll indicate a normal EOF condition in that directory for most code paths. If someone is trying to list the directory's entries, for instance, we'll just return that there are no entries in the dir, even though the dir itself is just invalid. To avoid this for at least some cases, return an EIO error instead if the dir blob is completely empty. Change-Id: I8544e125ad12632523d7c514fe63ff9d87e1cd8f Reviewed-on: https://gerrit.openafs.org/13429 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 1a0e5e867107b3f849c17f30976831b5bf5a0e94 Author: Andrew Deason Date: Thu Jan 31 15:44:38 2019 -0600 volser: Remove unused VolRestore flags args SAFSVolRestore has a 'flags' argument, which the volserver passes on to various internal functions, but the value of the flags never actually changes any behavior. Remove the 'aflags' argument (and the derived 'incremental' arg) from a few of our internal functions. The relevant arguments have been unused since OpenAFS 1.0. Change-Id: Ib6ba3d5d9aa3e29d720921cb32fe45c871cd803e Reviewed-on: https://gerrit.openafs.org/13458 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 1637e0a220157f9e4eb82de82e7372216f95af4e Author: Michael Meffie Date: Tue Jan 29 11:22:41 2019 -0500 xstat: remove unused variable Fix unused variable warning for unused variable oneShotCode. Change-Id: I8c2a5e8bf0cfc2570985b17d8e250403d459e50a Reviewed-on: https://gerrit.openafs.org/13455 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Reviewed-by: Michael Meffie commit 2c6d979be68ee95c9928b91f328b03070342173e Author: Michael Meffie Date: Tue Jan 29 11:20:52 2019 -0500 scout: fix missing softsig header Fix implicit declaration of function opr_softsig_Init() in scout. 
Change-Id: I2bb9eb5240b053b2f16ef1f37035b01dbc42fb84 Reviewed-on: https://gerrit.openafs.org/13454 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Michael Meffie Tested-by: Michael Meffie commit c37cdbeab4e4675e71b7764994cd7e68ac46c111 Author: Michael Meffie Date: Tue Jun 12 11:37:01 2018 -0400 viced: use calloc in SRXAFS_GetXStats The file server stats are maintained in global static structures, which are zero-ed on program start. The full contents are memcpy-ed to allocated buffers as rx output arguments, so no uninitialized data is sent over the wire. However, this commit converts the output buffer allocation from malloc to calloc to make this more clear from code inspection and make the code more robust. While here, clean up the comments in SRXAFS_GetXStats and remove the commented out code for a collection type which was never implemented. Remove the comments about overwriting spare xstat values, which seems to be a remnant from an early version of the code. For informational purposes, add a note at the top of SRXAFS_GetXStats to make it clear the CallPreamble() is intentionally avoided in this implementation of the GetXStats RPC. Apparently, the CallPreamble() is omitted since the OpenAFS file server does not need to send callbacks to clients issuing only GetXStats RPCs, and so also avoids sending TMAY requests to clients like xstat_fs_test. Note that the presumably older GetStatistics and GetStatistics64 do unfortunately invoke CallPreamble(), so programs such as scout must be able to receive RXAFSCB RPCs from OpenAFS file servers. Change-Id: I7b90c7c6c561c74961fb7f7694a9576e1bed44d6 Reviewed-on: https://gerrit.openafs.org/13204 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 6b67cac432043a43d7cdfa6af972ab54412aff94 Author: Michael Meffie Date: Tue Oct 17 16:39:50 2017 -0400 convert xstat and friends to pthreads Convert the xstat, fsprobe, and gtx libraries and test programs to pthreads.
Build these libraries with libtool. Build the scout and afsmonitor programs with pthreads instead of LWP. Change-Id: Ie1737e71b4e57735bf7b6c7dc3177d717ea35ac6 Reviewed-on: https://gerrit.openafs.org/12753 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 6575af97f4baf1728882ebe8f4ce474334f52ea5 Author: Michael Meffie Date: Thu Nov 15 16:19:51 2018 -0500 auth: fix afsconf_GetExtendedCellInfo memory leak Commit c4a127d0578e521b97131c5dedf9da58f71b0242 (ubik-clone-support-20010212) added changes to support ubik clone sites. This commit added the afsconf_GetExtendedCellInfo function, which returns the info given by the original afsconf_GetCellInfo, plus an array of booleans (as chars) to indicate which cell servers are ubik clones. Unfortunately, the afsconf_GetExtendedCellInfo function calls the afsconf_OpenInternal function on an already opened configuration. It does so to look for server entries which are marked as clone sites in the CellServDB file. Opening the already opened configuration leaks at least the cellName and local realms information, and is generally confusing. Instead, remember which sites are designated as clone sites when the CellServDB is read when the configuration is opened, and return that info to the callers of afsconf_GetExtendedCellInfo. This commit adds the clone array to the afsconf_cell structure and changes to afsconf_GetCellInfo() for this new server-related data. As part of this change, remove the no longer needed cell and clones arguments to the internal function afsconf_OpenInternal, which were added by commit c4a127d0578e521b97131c5dedf9da58f71b0242. Update the testcellconfig test program to output the new afsconf_cell clone member. This leak was found with valgrind. 
Change-Id: I73db60b6a4a77e620e0511ca45cc3418503278a4 Reviewed-on: https://gerrit.openafs.org/13396 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Reviewed-by: Mark Vitale Tested-by: BuildBot commit 80ed9d98779135d43f23c9e51e7bd6bce36405f1 Author: Michael Meffie Date: Fri Nov 16 10:00:17 2018 -0500 auth: plug auth realms memory leaks The function _afsconf_FreeRealms, called by afsconf_CloseInternal, leaks two afsconf_realms structures. The function _afsconf_LoadRealms also leaks those two structures when it fails. These memory leaks were discovered with valgrind. Change-Id: I1436ce21609951bc3433b6c91221cc45e78881bc Reviewed-on: https://gerrit.openafs.org/13395 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Reviewed-by: Mark Vitale commit 93b26c6f55245e2187e574eb928f5e0ce66a245e Author: Michael Meffie Date: Fri Dec 7 20:29:03 2018 -0500 Add the CellServDB pathname to the afsconf_dir The determination of the CellServDB pathname is platform-dependent. However, error reporting in the current code base assumes the CellServDB location is platform-independent. Add the pathname of the CellServDB file to the configuration directory structure and set the new cellservDB member when opening the configuration. Use this value when checking if the CellServDB has changed and update the callers to use the cellservDB member when reporting errors about the CellServDB file. 
Change-Id: I5a3393fb9d4ae3c637d5a0d773598115314bfe1c Reviewed-on: https://gerrit.openafs.org/13408 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit ce327b568f4ff522aa008f235d97e0d9144eb92c Author: Andrew Deason Date: Thu Jan 17 00:12:06 2019 -0600 afs: Do not ignore errors in afs_CacheFetchProc afs_CacheFetchProc currently has a section of code that looks like this pseudocode: if (!code) do { while (length > 0) { code = read_from_rx(); if (code) { break; } code = write_to_cache(); if (code) { break; } } code = 0; } while (moredata); return code; When we encounter an error when reading from rx or writing to the cache, we break out of the current loop to stop processing and return an error. But there are _two_ loops in this section of the code, so what we actually do is break out of the inner loop, set 'code' to 0, and then usually return (since 'moredata' is usually never set). This means that when we encounter an unexpected error either from the net or disk (or the memcache layer), we ignore the error and return success. This means that we'll store a subset of the relevant chunk's data to disk, and flag that chunk as complete and valid for the relevant DV. If the error occurred before we wrote anything to disk, this means we'll store an empty chunk and flag it as valid. The chunk will be flagged as valid forever, serving invalid data, until the cache chunk is evicted or manually kicked out. This can result in files and directories appearing blank or truncated to applications until the bad chunk is removed. Possibly the most common way to encounter this issue is when using a disk cache, and the underlying disk partition is full, resulting in an unexpected ENOSPC error. Theoretically this can be seen from an unexpected error from Rx, but we would have to see a short read from Rx without the Rx call being aborted. If the call was aborted, we'd get an error from the call to rx_EndCall() later on. 
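The lost-error pattern in the pseudocode above can be reproduced in a small standalone sketch. Everything here is hypothetical scaffolding: sim_step stands in for both read_from_rx() and write_to_cache(), SKETCH_EIO is an arbitrary nonzero error value, and fetch_with_break and fetch_with_goto are illustrative names, not functions in the tree.

```c
#include <assert.h>

#define SKETCH_EIO 5   /* arbitrary nonzero error code for the demo */

/* Simulated I/O: the call numbered fail_at (1-based) fails; 0 means never. */
struct sim {
    int calls;
    int fail_at;
};

static int
sim_step(struct sim *s)
{
    return (++s->calls == s->fail_at) ? SKETCH_EIO : 0;
}

/* 'break' leaves only the inner loop, so the error is overwritten. */
static int
fetch_with_break(struct sim *s, int length)
{
    int code = 0;
    int moredata = 0;

    do {
        while (length > 0) {
            code = sim_step(s);     /* stands in for read_from_rx() */
            if (code)
                break;
            code = sim_step(s);     /* stands in for write_to_cache() */
            if (code)
                break;
            length--;
        }
        code = 0;                   /* the error from above is lost here */
    } while (moredata);
    return code;
}

/* Jumping to a label past both loops preserves the error. */
static int
fetch_with_goto(struct sim *s, int length)
{
    int code = 0;
    int moredata = 0;

    do {
        while (length > 0) {
            code = sim_step(s);
            if (code)
                goto done;
            code = sim_step(s);
            if (code)
                goto done;
            length--;
        }
        code = 0;
    } while (moredata);
 done:
    return code;
}
```

With a failure injected on the third simulated call, fetch_with_break reports success after processing only part of the data, while fetch_with_goto returns the error to its caller.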
To fix this, change all of these 'break's into 'goto done's, to be more explicit about where we are jumping to. Convert all of the 'break's in this function in the same way, to make the code flow more consistent and easier to follow. Remove the 'if () do' on a single line, since it makes it a little harder to see from a casual glance that there are two nested loops here. This problem appears to have been introduced in commit 61ae8792 (Unite CacheFetchProcs and add abstraction calls), included in OpenAFS 1.5.62. Change-Id: Ib965a526604e629dc5401fa0f1e335ce61b31b30 Reviewed-on: https://gerrit.openafs.org/13428 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit 21ad6a0c826c150c4227ece50554101641ab4626 Author: Cheyenne Wills Date: Fri Jan 18 17:22:44 2019 -0700 Linux_5.0: replaced current_kernel_time with ktime_get_coarse_real_ts64 In Kernel commit fb7fcc96a86cfaef0f6dcc0665516aa68611e736 the current_kernel_time/current_kernel_time64 functions were renamed and the calling was standardized. According to the Linux Documentation/core-api/timekeeping.rst ktime_get_coarse_real_ts64 is the direct replacement for current_kernel_time64. Because of year 2038 issues, there is no replacement for current_kernel_time. Updated code that used current_kernel_time to use new name and calling convention. Updated autoconf test that sets IATTR_TAKES_64BIT_TIME as well. Change-Id: I607bdcf6f023425975e5bb747e0e780b3d2a7ce5 Reviewed-on: https://gerrit.openafs.org/13434 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit b892fb127815bdf72103ae41ee70aadd87931b0c Author: Cheyenne Wills Date: Fri Jan 18 16:53:58 2019 -0700 Linux_5.0: replace do_gettimeofday with ktime_get_real_ts64 In Kernel commit e4b92b108c6cd6b311e4b6e85d6a87a34599a6e3 the do_gettimeofday function was removed.
According to the Linux Documentation/core-api/timekeeping.rst ktime_get_real_ts64 is the direct replacement for do_gettimeofday Updated the macro osi_GetTime to use ktime_get_real_ts64 if it is available. Change-Id: I7fcd49958de83a6a040e40bd310a228247c481b2 Reviewed-on: https://gerrit.openafs.org/13433 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 10b02075a262dbe802266ea4bcac3936dff5dd23 Author: Mark Vitale Date: Fri Jan 18 17:05:49 2019 -0500 LINUX: correct include for ktime_get_coarse_real_ts64() The include for the ktime_get_coarse_real_ts64() autoconf test is incorrect; ktime_get_coarse_real_ts64() has always been in linux/ktime.h (via #include timekeeping.h), not linux/time.h. This autoconf test still ran correctly because the OpenAFS build was inadvertently picking up ktime.h via the default autoconf include path. Therefore, this commit is needed only to provide documentation and clarity to future maintainers. Introduced as a cut-n-paste error (from the current_kernel_time test) with commit 3c454b39d04f4886536267c211171dae30dc0344 for Linux 4.20. Change-Id: I994b03a1700330756216c7feab0121c82d0f3ee4 Reviewed-on: https://gerrit.openafs.org/13437 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 3969bbca6017eb0ce6e1c3099b135f210403f661 Author: Cheyenne Wills Date: Thu Jan 17 16:00:37 2019 -0700 Linux_5.0: Use super_block flags instead of Mount flags when filling sb In Kernel commit e262e32d6bde0f77fb0c95d977482fc872c51996 the mount flags (MS_) were moved from uapi/linux/fs.h to uapi/linux/mount.h. This caused a compile failure in src/afs/LINUX/osi_vfsops.c The Linux documentation in uapi/linux/mount.h indicates that the MS_ (mount) flags should only be used when calling sys_mount and filesystems should use the SB_ (super_block) equivalent. src/afs/LINUX/osi_vfsops.c utilized the mount flag MS_NOATIME while filling the super_block. Changed to use SB_NOATIME (which has the same numeric value as MS_NOATIME) if available. 
Change-Id: I2b2199de566fbadd45e857b37d24ce63002c7736 Reviewed-on: https://gerrit.openafs.org/13432 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 892045a9803ed471986569705d9d727165ca7ecf Author: Marcio Barbosa Date: Sat Aug 11 13:17:28 2018 -0400 vol: remove empty directories left by vos zap -force The vos zap -force command does not remove the directories associated with the volume in question (AFS_NAMEI_ENV). When the vos zap -force command is executed, the volume server goes through the /vicep*/AFSIDat directories and removes the files associated with the volume id received as an argument. Unfortunately, the volume server does not remove the directories associated with this volume. As a result, empty directories are left behind. To fix this problem, remove the empty directories left behind when vos zap -force is executed. Change-Id: I56fd52918223f87e424121bac6a086d7b0a46284 Reviewed-on: https://gerrit.openafs.org/12879 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 345a739b7bb6c9c142a2b0fe584fed6c44d6c655 Author: Andrew Deason Date: Tue Nov 13 11:09:52 2018 -0600 roken: Use srcdir for roken-post.h roken-post.h is a source file, not a generated file in the objdir. Specify $(srcdir) so we can work with objdir builds. Change-Id: I1d00ba1f28bea99770c2af56890fbf22ee764820 Reviewed-on: https://gerrit.openafs.org/13387 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit a28f9d28aef18936eb0ea02491ce64c72eeb1fe9 Author: Cheyenne Wills Date: Wed Nov 28 15:45:20 2018 -0700 Redhat: correct path to kernel module in dkms.config This fix corrects some annoying error and warning messages during dkms install or uninstall. Install: DKMS: build completed. openafs: Running module version sanity check. 
ERROR: modinfo: could not open /lib/modules/2.6.32-754.6.3.el6.x 86_64/weak-updates/openafs.ko: No such file or directory - Original module - No original module exists within this kernel - Installation - Installing to /lib/modules/2.6.32-754.6.3.el6.x86_64/extra/ Adding any weak-modules WARNING: Can't read module /lib/modules/2.6.32-754.6.3.el6.x86_6 4/weak-updates/openafs.ko: No such file or directory egrep: /lib/modules/2.6.32-754.6.3.el6.x86_64//weak-updates/open afs.ko: No such file or directory Remove Status: Before uninstall, this module version was ACTIVE on this kernel. Removing any linked weak-modules rmdir: failed to remove `.': Invalid argument WARNING: Can't read module /lib/modules/2.6.32-754.6.3.el6.x86_6 4/weak-updates/openafs.ko: No such file or directory egrep: /lib/modules/2.6.32-754.6.3.el6.x86_64//weak-updates/open afs.ko: No such file or directory openafs.ko: - Uninstallation - Deleting from:/lib/modules/2.6.32-754.6.3.el6.x86_64/extra/ - Original module - No original module was found for this module on this kernel - Use the dkms install command to reinstall any previous module version. Background: Commit 1c96127e37c0ec41c7a30ea3e4aa68f3cc8a24f6 standardized the location where the openafs.ko module is installed (from /kernel/3rdparty to /extra/). The RPM Spec file was not updated to build the dkms.conf file with the corrected location. From the documentation for dkms DEST_MODULE_LOCATION is ignored on Fedora Core 6 and higher, Red Hat Enterprise Linux 5 and higher, Novell SuSE Linux Enterprise Server 10 and higher, Novell SuSE Linux 10.0 and higher, and Ubuntu. Instead, the proper distribution-specific directory is used. However the DEST_MODULE_LOCATION is still used saving and restoring old copies of the module. The NO_WEAK_MODULES parameter prevents dkms from creating a symlink into weak-updates directory, which can lead to broken symlinks when dkms-openafs is removed. 
The weak modules facility was designed to eliminate the need to rebuild kernel modules when kernel upgrades occur and relies on the symbols within the kABI. Openafs uses symbols that are outside the kABI, and therefore is not a candidate for a weak module. Change-Id: I52a332036056a359a57a3ab34d56781c896a2eea Reviewed-on: https://gerrit.openafs.org/13404 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit 0bd55a02bb5707b1b8b26347d5cb6ad71765f622 Author: Michael Meffie Date: Thu Dec 27 09:32:35 2018 -0500 build: declare test targets as phony Modern versions of `make` will not build the 'test' target since a directory exists with the same name. $ grep -C1 '^test:' Makefile test: cd test; $(MAKE) $ make test make: 'test' is up to date. Declare these targets as .PHONY to force make to build the test programs even when the 'test' directory is present. Also use '&&' to concatenate commands instead of ';' to avoid running the second command when the first fails. Change-Id: Id561d7610f80b87b59c632801fa0a4b216feb42d Reviewed-on: https://gerrit.openafs.org/13419 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit f6182922455aa0cbee19d138b0827eb87dc2b7ce Author: Andrew Deason Date: Mon Jan 14 17:12:27 2019 -0600 lwp: Avoid freeing 'stackmemory' on AIX32 Commit 55013a11 (lwp: Fix possible memory leak from scan-build) added some free() calls to some otherwise-leaked memory. However, one of these calls frees the 'stackmemory' pointer, which on AIX32 is not a pointer from malloc/calloc, but calculated from reserveFromStack(). To avoid corrupting the heap, skip this free call on AIX32. This commit adds another #ifdef to avoid this, which is unfortunate, but this is also how the free is avoided in the existing code for Free_PCB().
Change-Id: I6c4518f810e56c362ee744f250747fe8fc765b13 Reviewed-on: https://gerrit.openafs.org/13426 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit d0dbd0f12119f0e874ba30adec81061ac6ae27c7 Author: Mark Vitale Date: Fri Oct 5 10:39:23 2018 -0400 rx: remove rx_atomic bitops The rx_atomic bitops were introduced with commit 1839cdbe268f4b19ac8e81ae78548f5c78e0c641 ("rx: atomic bit ops"). The last (only) reference to them was recently removed with commit 5ced6025b9f11fadbdf2e092bf40cc87499ed277 ("rx: Convert rxinit_status to rx_IsRunning()"). Remove the now unreferenced bitops. This commit is comprised of partial or complete reverts of the following commits: ae4ad509d35 rx: fix rx_atomic warnings under Solaris (partial) c16423ec4e6 rx: fix atomics on darwin (partial) 9dc6dd9858a rx: Fix AIX test_and_set_bit (complete) 1839cdbe268 rx: atomic bit ops (complete) Note: The rx_atomic bitops for Linux systems are known to be broken due to incorrect casting of rx_atomic_t into the unsigned long operand expected by the native Linux bitops. The failure modes include silent overruns on little-endian and incorrect results on big-endian. Do not merely revert this commit in order to bring these bitops back into the tree. Change-Id: I6b63519f63d370ccc8df816b4388487909c17dcd Reviewed-on: https://gerrit.openafs.org/13390 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit b2475c11f4d430402a82cb5b018dbccdaa0dccd8 Author: Andrew Deason Date: Thu Dec 20 14:29:47 2018 -0600 rx: Statically check rx_statisticsAtomic size Currently, rx_GetStatistics assumes that struct rx_statistics and rx_statisticsAtomic have the same size (we just memcpy between them). However, this is never checked, and rx_statistics contains many 'int' fields where rx_statisticsAtomic has rx_atomic_t fields. If these are not the same size, our rx stats will silently break, so add a static assert to make sure they are the same size. 
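For illustration only, a minimal sketch of the kind of compile-time size check described above. The struct and function names are hypothetical stand-ins (the real rx_statistics structures have many more fields), and the sketch assumes C11 static_assert; the commit itself may use a different assertion mechanism.

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>   /* memcpy */

/* Hypothetical stand-in for rx_atomic_t: a struct wrapping one int. */
typedef struct {
    int var;
} demo_atomic_t;

/* Hypothetical plain-int statistics structure. */
struct demo_stats {
    int packetRequests;
    int dataPacketsSent;
};

/* Hypothetical atomic counterpart with the same field layout. */
struct demo_stats_atomic {
    demo_atomic_t packetRequests;
    demo_atomic_t dataPacketsSent;
};

/*
 * Fail the build, rather than silently copying garbage at run time,
 * if the two layouts ever diverge in size.
 */
static_assert(sizeof(struct demo_stats) == sizeof(struct demo_stats_atomic),
              "demo_stats and demo_stats_atomic must have the same size");

/* Copy the atomic counters into the plain structure with a raw memcpy. */
static void
demo_get_stats(const struct demo_stats_atomic *src, struct demo_stats *dst)
{
    memcpy(dst, src, sizeof(*dst));
}
```

If the atomic type were widened (for example to a long on an LP64 platform), the assertion above would stop the compile instead of letting the memcpy produce wrong counters.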
Change-Id: I889867f4a85530c30dd15d32d1822144ea128a95 Reviewed-on: https://gerrit.openafs.org/13414 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit fa3ce81178b23ee2d96f4e496484c23ed0ce7bfc Author: Andrew Deason Date: Thu Dec 20 14:37:31 2018 -0600 Revert "rx: fix rx_atomic warnings under Solaris" This reverts commit ae4ad509d35aab73936a1999410bd80bcd711393. While that commit did fix the mentioned warnings on Solaris, it also changed the size of rx_atomic_t. Our code in rx_stats.c assumes that an rx_atomic_t is 4-bytes wide, and so changing the size of rx_atomic_t broke our reporting for stats in the 'rx_stats' structure. To fix this, revert that commit. This reintroduces the mentioned warnings, but those warnings are reported for our atomic bit-op functions, which are unused and will be removed by another commit. Change-Id: Ie3e72cc06690d9f8de79e8f0274ea51079004c38 Reviewed-on: https://gerrit.openafs.org/13415 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 67c406e57a9a4409b3da811546660ac596888b2f Author: Michael Meffie Date: Thu Nov 15 13:49:21 2018 -0500 auth: update the auth test programs Fix build errors for the auth test programs. Close the configuration directory before exiting the testcellconf program so we can check for leaks. Add a call to afsconf_GetExtendedCellInfo to the testcellconf test program. Use libcmd to parse the testcellconf command line options. Add the -reload option to testcellconf to perform an optional reload test. The user must have file permissions to touch the CellServDB to perform the reload test. 
Change-Id: I1cb4cacf9a15ccf7066fb32bfe5f5d03ef64bfd7 Reviewed-on: https://gerrit.openafs.org/13394 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit d6f52d11c358f71b2c4357cb135e898de7c6277b Author: Mark Vitale Date: Mon Oct 29 16:48:14 2018 -0400 afs: avoid afs_GetDownDSlot panic on afs_WriteDCache failure If afs_GetDownDSlot() finds insufficient free slots in the afs_freeDSList, it will walk the afs_DLRU attempting to flush and free eligible dcaches. However, if an error occurs during the flush to CacheItems (afs_WriteDCache()), e.g., -EINTR, afs_GetDownDSlot() will assert. However, a panic in this case is overkill, since afs_GetDownDSlot() is a best-effort attempt to free dslots. The caller (afs_UFSGetDSlot()) will allocate more dcaches if needed. Instead: - Refactor afs_GetDownDSlot() by moving the QRemove() call to after the afs_WriteDCache logic, so it accompanies the logic that puts the dcache back on the freelist. This is safe because we hold the afs_xdcache W lock for the duration of the routine. - If afs_WriteDCache() returns an error, return early and let the caller handle any recovery. Change-Id: Ifd0d56120095c9792998ff935776bbd339a76c8a Reviewed-on: https://gerrit.openafs.org/13364 Reviewed-by: Andrew Deason Tested-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk commit 59d3a8b86da648e3c5b9774183c6c8571a36f0c4 Author: Mark Vitale Date: Fri Nov 30 12:10:50 2018 -0500 vos: restore status information to 'vos status' Commit d3eaa39da3693bba708fa2fa951568009e929550 'rx: Make the rx_call structure private' created accessors for several rx_call members. However, it simply #ifdef'd out the packet counters and timestamps reported by 'vos status' (AFSVol_Monitor). This is a regression for the 1.8.x 'vos status' command. Instead, supply an accessor so 'vos status' can again be used to monitor the progress of certain volume operations.
FIXES 134856 Change-Id: I91f5831b21f128bd8e86db63387a454c9e57bcdf Reviewed-on: https://gerrit.openafs.org/13400 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Mark Vitale commit d9d9571785dabc5c311111b1263fe0881b0ccda5 Author: Andrew Deason Date: Thu Dec 13 12:25:32 2018 -0600 afs: Reword "cache is full" messages Currently, there are multiple different areas in the code that log a message that look like this, when we encounter an ENOSPC error when writing to the cache: *** Cache partition is FULL - Decrease cachesize!!! *** The message is a bit unclear, and doesn't even mention AFS at all. Reword the message to try to explain a little more what's happening. Also, since we log the same message in several different places, move them all to a common function, called afs_WarnENOSPC, so we only need to change the message in one place. Change-Id: If1c259bd22a382ff56ed29326aa20c86389d06bc Reviewed-on: https://gerrit.openafs.org/13410 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 16b981ec6697b511c36c09adfeb8b79eaf2345b0 Author: Mark Vitale Date: Thu Nov 15 15:41:24 2018 -0500 afs: remove dead code afs_osi_SetTime afs_osi_SetTime() has been dead code since -settime support was removed with commit 1d9888be486198868983048eeffabdfef5afa94b 'Remove -settime/RXAFS_GetTime client support'. Remove the dead code. No functional change is incurred by this commit. Change-Id: Ie5559325b4c98d7e0786c75ae6507ab9c2c47376 Reviewed-on: https://gerrit.openafs.org/13393 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit aa80f892ec39e2984818090a6bb2047430836ee2 Author: Mark Vitale Date: Thu Nov 15 15:31:37 2018 -0500 Linux 4.20: do_settimeofday is gone With Linux commit 976516404ff3fab2a8caa8bd6f5efc1437fed0b8 'y2038: remove unused time interfaces', do_settimeofday() is gone. 
However, OpenAFS only calls do_settimeofday() from afs_osi_SetTime(), which has been dead code since -settime support was removed from afsd with commit 1d9888be486198868983048eeffabdfef5afa94b 'Remove -settime/RXAFS_GetTime client support'. Instead of fixing afs_osi_SetTime() to use a current Linux API, remove it as dead code. No functional change is incurred by this commit. However, this change is required in order to build OpenAFS on Linux 4.20. Change-Id: I74913deb249de66b0da71539f2596c971f0fd99a Reviewed-on: https://gerrit.openafs.org/13392 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 3c454b39d04f4886536267c211171dae30dc0344 Author: Mark Vitale Date: Tue Nov 13 11:20:09 2018 -0500 Linux 4.20: current_kernel_time is gone With Linux commit 976516404ff3fab2a8caa8bd6f5efc1437fed0b8 'y2038: remove unused time interfaces' (4.20-rc1), current_kernel_time() has been removed. Many y2038-compliant time APIs were introduced with Linux commit fb7fcc96a86cfaef0f6dcc0665516aa68611e736 'timekeeping: Standardize on ktime_get_*() naming' (4.18). According to Documentation/core-api/timekeeping.rst, a suitable replacement for: struct timespec current_kernel_time(void) would be: void ktime_get_coarse_real_ts64(struct timespec64 *ts) Add an autoconf test and equivalent logic to deal with the change. Change-Id: I4ff622ad40cc6d398267276d13493d819b877350 Reviewed-on: https://gerrit.openafs.org/13391 Tested-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit bfb2ebdfc2c0bfd252a14ddbe1681ab22b6733c5 Author: Andrew Deason Date: Mon Oct 15 16:10:59 2018 -0500 ubik: calloc ubik_dbase Instead of using malloc and initializing various fields to 0, allocate our ubik_dbase using calloc, to more easily ensure all fields are initialized.
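For illustration only, a minimal sketch of the calloc pattern described above. The struct and its fields are hypothetical and are not the real ubik_dbase layout.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical database handle; not the real struct ubik_dbase. */
struct demo_dbase {
    char *pathName;
    int version_epoch;
    int version_counter;
    void *activeTrans;
};

/*
 * Allocate a zero-filled handle. With calloc, any field added to the
 * struct later also starts out as zero, with no per-field
 * initialization to forget. (All-bits-zero is assumed to be a null
 * pointer, which holds on common platforms.)
 */
static struct demo_dbase *
demo_dbase_alloc(void)
{
    return calloc(1, sizeof(struct demo_dbase));
}

/* Sanity-check helper: every field of a fresh handle is zero/NULL. */
static void
demo_dbase_check_zeroed(const struct demo_dbase *db)
{
    assert(db != NULL);
    assert(db->pathName == NULL);
    assert(db->version_epoch == 0);
    assert(db->version_counter == 0);
    assert(db->activeTrans == NULL);
}
```

A caller would pair demo_dbase_alloc() with free() once the handle is no longer needed.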
Change-Id: I5c2f345a82a2eb73d53ffc3e1b0fa408af6a8311 Reviewed-on: https://gerrit.openafs.org/13363 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 84b3e1c43685862c147603627a020a68650d6e1c Author: Mark Vitale Date: Fri Oct 26 09:12:44 2018 -0400 viced: fix typo in help for option -unsafe-nosalvage Change-Id: I4e72533747250cee1b7d8c091c63c78948be6c28 Reviewed-on: https://gerrit.openafs.org/13367 Reviewed-by: Stephan Wiesand Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit d058acb354cab9856303cc341a1f439e4f7f3454 Author: Mark Vitale Date: Thu Oct 25 10:27:41 2018 -0400 viced: correct option parsing for -vlru*, -novbc Commit a5effd9f1011aa319fdf432c67aec604053b8656 "viced: Use libcmd for command line options" modernized the option parsing for (da)fileserver, but introduced a few errors for the following options: -vlruthresh -vlruinterval -vlrumax -novbc Correct the errors. Change-Id: If57dfabaa8d4e456b63d47694d288bd8c4235ad2 Reviewed-on: https://gerrit.openafs.org/13365 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit a8219383946b821a907d75b02b7255ca1a162d23 Author: Andrew Deason Date: Sat Oct 20 16:56:01 2018 -0500 budb: Remove db.lock Ever since commit dc8f18d6 (Protect ubik cache accesses), the 'lock' field in struct memoryDB has been unused. Remove it from the struct definition. Change-Id: I90131421ae2e2322debf4249e7464126480832d1 Reviewed-on: https://gerrit.openafs.org/13362 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 7eeec611811ad81f55de4befd70ed47466a5b248 Author: Andrew Deason Date: Sat Oct 20 16:56:57 2018 -0500 ubik: Remove version_cond Several areas in the code do something like this whenever the database version is changed: #ifdef AFS_PTHREAD_ENV opr_cv_broadcast(&ubik_dbase->version_cond); #else LWP_NoYieldSignal(&ubik_dbase->version); #endif However, ever since commit 3fae4ea1 (ubik: remove unused code), nothing in the tree waits for this condvar, so it currently doesn't do anything. Remove this unneeded code. 
Change-Id: I6903ed89f9dcee2ce154be8883d656d297c97902 Reviewed-on: https://gerrit.openafs.org/13361 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 0f65b40b24599d58cf30bfd47fae83ab54e1416a Author: Andrew Deason Date: Wed Oct 17 16:35:36 2018 -0500 Remove one more automake VERSION reference The configure summary was still referencing the old automake-specific VERSION var. Use the autoconf PACKAGE_VERSION var instead, so this actually shows our version. Change-Id: I18007935d0235931f1d2e023abddee7356e8ac2d Reviewed-on: https://gerrit.openafs.org/13360 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit db38561dea2dc092dcd74082676b2a7c7f56b51c Author: Michael Meffie Date: Wed Apr 4 18:42:46 2018 -0400 autoconf: remove unnecessary mkdir during configure Remove an unneeded mkdir command to create the JAVA/libjafs object directory, since this directory is automatically created by config.status when generating the JAVA/libjafs/Makefile. Change-Id: Ib02a38c5c23790cb07e5c2433fd4870e8763c3a3 Reviewed-on: https://gerrit.openafs.org/12994 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit eb47fa9879785a8a88ef041667845bb4d005b77e Author: Michael Meffie Date: Wed Apr 4 18:20:02 2018 -0400 autoconf: remove spurious no-op Change-Id: I27242481dc3039f6776deb89e31793deee7f2840 Reviewed-on: https://gerrit.openafs.org/12993 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit b1b3322a68d50318c44caeb7889fd181dc441149 Author: Michael Meffie Date: Wed Apr 4 18:13:24 2018 -0400 autoconf: fix pio checks name The autoconf macro to perform the positional i/o checks was misnamed as hpux checks (since there happens to be a specific check for hpux at the top of the macro). Change the macro name and m4 file name to be more accurately named.
Change-Id: Ib85728fbfe67930cb5f9f1f0e34f7aa1195fdfc6 Reviewed-on: https://gerrit.openafs.org/12992 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 65b55bcc26f69f25c67518f672b34be73f3be370 Author: Michael Meffie Date: Thu Dec 21 11:59:38 2017 -0500 vol: avoid query for parent id when deleting disk header When a DAFS volume server removes a volume disk header file (V*.vol), the volume server invokes an fssync command to have the file server delete the Volume Group Cache (VGC) entry corresponding to the volume id and the parent id of the removed volume header. The volume parent id is unknown to the volume server when removing a volume disk header on behalf of a "vos zap -force" operation. In this case, the volume server issues a fssync query to attempt to look up the parent id from the file server's VGC. If this fssync query fails for some reason, the volume server is unable to delete the VGC entry for the deleted volume header. The volume server logs an error and vos zap reports an undocumented error code. One common way this can be encountered is to issue a "vos zap -force" on a file server that has just been restarted. In this case, the VGC may not be fully populated yet, so the volume server is not able to look up the parent id of the given volume. With this commit, relax the requirement for the parent id when deleting VGC entries. A placeholder of 0 is used to mean any parent id for the given volume id. This obviates the need to query for the parent id when performing a "vos zap -force", and allows the volume server to remove any VGC entries associated with the volume id being zapped.
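For illustration only, a minimal sketch of the "parent id 0 matches any parent" rule described above. The flat table, struct, and function name are hypothetical; the real VGC is organized differently and is updated through fssync.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat cache entry; not the real VGC data structure. */
struct demo_vgc_entry {
    unsigned int volid;   /* volume id */
    unsigned int parent;  /* parent (volume group) id */
    int in_use;           /* nonzero while the entry is valid */
};

/*
 * Delete entries for 'volid'. A 'parent' of 0 is a wildcard meaning
 * "any parent id", so the caller does not need to look the parent up
 * first. Returns the number of entries removed.
 */
static int
demo_vgc_delete(struct demo_vgc_entry *tbl, size_t n,
                unsigned int volid, unsigned int parent)
{
    int removed = 0;

    assert(tbl != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!tbl[i].in_use || tbl[i].volid != volid)
            continue;
        if (parent != 0 && tbl[i].parent != parent)
            continue;   /* a specific parent was given and differs */
        tbl[i].in_use = 0;
        removed++;
    }
    return removed;
}
```

With a nonzero parent the call behaves as a strict (volume id, parent id) match; with 0 it removes every entry for the volume id.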
Change-Id: Iee8647902d93a3c992fca4c4f3880a3393f0b95f Reviewed-on: https://gerrit.openafs.org/12839 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 2f2c2ce62aa17ecac3651d64c1168af926f7458b Author: Andrew Deason Date: Thu Oct 11 00:18:17 2018 -0500 Remove automake autoconf vars Commit 4706854f (autoconf: updates and cleanup) removed our invocation of AM_INIT_AUTOMAKE, which defines the output variables PACKAGE and VERSION. Several files in our build system are still referencing @PACKAGE@ and @VERSION@, though, leaving them un-substituted. This most easily is seen as the AFSVersion version string remaining as "@VERSION@" when the tree is built without git, but it also affects some packaging in the tree. Remove references to @VERSION@ and @PACKAGE@, replacing them with their autoconf equivalents @PACKAGE_VERSION@ and @PACKAGE_TARNAME@. Change-Id: I6c6a09a46c4af4259009a4a60cfdaee63d6258c2 Reviewed-on: https://gerrit.openafs.org/13357 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit d75bc6370f625479a67c7c0a50cce23c4d4a4ce5 Author: Andrew Deason Date: Fri Sep 28 17:12:40 2018 -0500 afs: Remove afs_xosi Since OpenAFS 1.0, all platforms in libafs have a lock called afs_xosi, which is acquired and released around calls like VOP_GETATTR on cache files. However, this lock doesn't appear to protect anything; on all platforms, the code that runs while the lock is held only calls VOP_GETATTR and accesses local variables (aside from afs_osi_cred, which we use similarly in many other places). The purpose of the lock has never been documented, and is not mentioned at all in the afs_rwlocks text file. The comment by the afs_xosi lock declaration suggests that the lock was originally introduced to protect access to 'tvattr', which perhaps was a global variable in the past. All uses of 'tvattr' are local now, though, so protecting access to it doesn't make any sense. So, remove afs_xosi, to remove the unnecessary serialization of VOP_GETATTR calls.
Change-Id: Ib3764600ae0155057361418c86b49a3507bdcd94 Reviewed-on: https://gerrit.openafs.org/13350 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 0548ee436d0f0f92a980d22e03149faedf38dc70 Author: Andrew Deason Date: Mon Oct 1 11:56:53 2018 -0400 afs: Free 'addrs' array Currently, 3 places in libafs allocate an 'addrs' array in a very similar way to loop through our list of servers: ForceAllNewConnections(), afs_LoopServers(), and PCallBackAddr(). Of these, only afs_LoopServers actually frees the array. ForceAllNewConnections and PCallBackAddr leak the memory, but these are only hit from infrequent pioctls that can only be run by root, so the impact is small. Fix ForceAllNewConnections and PCallBackAddr to free the array. Change-Id: Ic348e29cefa7c41cbcb30f738f943e8d022a97f0 Reviewed-on: https://gerrit.openafs.org/13355 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 2aeabf8c5bca22b400653e2bc88b6f36d47b05ca Author: Marcio Barbosa Date: Sun Sep 30 17:38:53 2018 -0400 macos: packaging support for MacOS X 10.14 This commit introduces the new set of changes / files required to successfully create the dmg installer on OS X 10.14 "Mojave". Change-Id: Ia1238b454350777bbfbf3dfd2be0c6c523348928 Reviewed-on: https://gerrit.openafs.org/13349 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 72b2670a9e2e3937ed4e47485b9e9fa6953b5444 Author: Marcio Barbosa Date: Wed Sep 26 00:18:38 2018 -0300 macos: add support for MacOS 10.14 This commit introduces the new set of changes / files required to successfully build the OpenAFS source code on OS X 10.14 "Mojave". Change-Id: Ib7cbd531ad6db3340d59e76abdecbe75886a4d5c Reviewed-on: https://gerrit.openafs.org/13348 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit bd58bb85004a18bb6681ff2b0c13a04e23c4d9c4 Author: Marcio Barbosa Date: Mon Oct 1 17:44:22 2018 -0400 auth: check if argument of afsconf_Close* is null Currently, we do not check if the argument of afsconf_Close / afsconf_CloseInternal is equal to null. 
In order to avoid a possible segmentation fault, add the checks. Change-Id: I45635ad2d735505637072867edb7ff17da3c671a Reviewed-on: https://gerrit.openafs.org/13352 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk Reviewed-by: Michael Meffie commit 0835d7c2a183f896096684df06258aefd297f080 Author: Michael Meffie Date: Fri Mar 16 09:25:18 2018 -0500 afs: make sure to call afs_Analyze after afs_Conn The afs_Conn function is used to pick a connection for a given RPC. The RPC is normally wrapped within a do-while loop which calls afs_Analyze to handle the RPC code and manage the server connection references. Among other things, afs_Analyze can mark the server as down, blacklist idle servers, etc. There are some special cases in which we break out of this do-while loop early, by putting the connection reference given by afs_Conn and then jumping out of the loop. In these cases, be sure to call afs_Analyze to put the server connection we got from afs_Conn, and to handle the RPC return code, possibly marking the server as down or blacklisted. Change-Id: Ic2c43f20d153376b93d79bbb5145914f8e478957 Reviewed-on: https://gerrit.openafs.org/13288 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 07ed94cfd817dc5a4e2d2712570087388fe7828f Author: Marcio Barbosa Date: Fri Oct 5 11:26:34 2018 -0400 DARWIN: replace macro exported by automake Commit 4706854f57043c8393baa922dd1974176e110a19 removed automake references from the source tree. As a result, VERSION (exported by AM_INIT_AUTOMAKE and obtained from Autoconf's AC_INIT macro) is not available anymore. Unfortunately, a reference to this macro can be found in src/afs/DARWIN/osi_module.c. Consequently, builds on OS X fail with the following message: osi_module.c:144:32: error: use of undeclared identifier 'VERSION' To fix this problem, replace VERSION by PACKAGE_VERSION (defined by AC_INIT). 
Change-Id: Ib3821d79c4cddd59c399985762e13dec755d8642 Reviewed-on: https://gerrit.openafs.org/13354 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit f0bab78cbe4f59609fa18647a480cc6989948786 Author: Michael Meffie Date: Mon Oct 1 11:38:37 2018 -0400 ubik: do not reuse the offset variable for the sync site address The ubik SendFile function performs a sanity check of the host address before proceeding with the file transfer. Currently this check reuses the file offset local variable to hold the value of the sync site address, a 32-bit IPv4 address. Not only is this confusing, but also causes a signed/unsigned type mismatch when comparing host addresses. Instead of being so stingy with local variables, declare a new local variable of the correct type to hold the value of the sync site address. This separation is also a prerequisite for supporting larger address types in the future. Change-Id: I116fe210f418e6914afeff37c44d30bf795e2413 Reviewed-on: https://gerrit.openafs.org/13351 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit d7ae7df42ced260471578dccc160f2f7a5bc686e Author: Andrew Deason Date: Mon Sep 24 15:41:23 2018 -0500 vlserver: Remove sascnvldb "sascnvldb" appears to be a variant of cnvldb that was used to convert vldb database blobs from even older versions than what cnvldb handles. However, it has never been built by default (some makefile rules reference the program, but it's never built unless the user explicitly runs 'make sascnvldb'), and it currently cannot build due to a variety of compiler errors. Remove the dead code. 
Change-Id: I5692d2cd058aa4ae9222ce25721001aabcca5eb7 Reviewed-on: https://gerrit.openafs.org/13345 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 0796de43eaceb3a28799ad0bbe11e335a3f919bc Author: Mark Vitale Date: Fri Jun 22 16:52:08 2018 -0400 fsint: remove dead code The last references to these objects were removed with commit 3828c257ae33306bbdd3c6db9381601fe5b1b110 "dead-code-and-prototyes-20060214". A few mentions of CBS and BBS are left in the documentation as historical references: - doc/man-pages/pod1/rxgen.pod - src/kauth/AuthServer.mss Change-Id: Ia24eef7bb1509ff10d11de5c51e688e27f69417a Reviewed-on: https://gerrit.openafs.org/13324 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 77ae3dc899e89f327328c874628f100a765846c4 Author: Michael Meffie Date: Fri Apr 4 10:27:10 2014 -0400 cmd: improve help for programs without subcommands Some programs do not have subcommands (other than the standard "help", and "version" subcommands). The cmd library provides the "noopcode" mechanism for new subcommand-less programs, but older programs take advantage of the optional "initcmd" token to simulate subcommand-less programs. The "initcmd" token is optional to run the command, however it is required to display the command help. For example, running the xstat_cm_test program without any options gives a syntax error: $ xstat_cm_test xstat_cm_test: Missing required parameter '-cmname' ... Retrying with -help (or help, -h, --help), gives the rather unhelpful output: $ xstat_cm_test -help xstat_cm_test: Commands are: apropos search by help text help get help on commands initcmd initialize the program It is not obvious to the user how to get the command usage for the program, nor that the initcmd subcommand to "initialize the program" is actually a placeholder to run the program. Instead, display the command usage when help is requested and initcmd is the only defined subcommand for a program.
For example: $ xstat_cm_test -help Usage: src/xstat/xstat_cm_test [initcmd] -cmname + -collID + [-onceonly] [-frequency ] [-period ] [-debug] [-help] Where: -onceonly Collect results exactly once, then quit -debug turn on debugging output The libcmd library now supports a "noopcode", which should be used for future subcommand-less programs, but converting old programs to remove the initcmd opcode could break scripts which actually specify the optional initcmd token. This commit adds a new libcmd flag called CMD_IMPLICIT which is used to denote built-in subcommands such as "version" and "help". Change-Id: Iee9cb2761254543f74166e5c240685f85b6915b6 Reviewed-on: https://gerrit.openafs.org/10983 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 2daa413e3ec061e0653adbd1d6549f15e0659a62 Author: Andrew Deason Date: Tue Aug 7 17:27:24 2018 -0500 Avoid format truncation warnings With gcc 7.3, we start getting several warnings like the following: vutil.c: In function ‘VWalkVolumeHeaders’: vutil.c:860:34: error: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 63 [-Werror=format-truncation=] snprintf(name, VMAXPATHLEN, "%s" OS_DIRSEP "%s", partpath, dentry->d_name); Most or all of these truncations should be okay, but increase the size of the relevant buffers so we can build with warning checking turned on. Change-Id: Iac62d6fcfa46f523c34bf1b0ebc2770d3d67c174 Reviewed-on: https://gerrit.openafs.org/13274 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit fa6edf73d4bbe39012f3431c60584a282a823233 Author: Andrew Deason Date: Tue Sep 25 16:52:14 2018 -0500 vlserver: Remove 'register' argument Commit 4a531cb7 (death to register) removed the 'register' declaration from variables/arguments. But commit 3bf03502 (vlserver: Add a struct for trans-specific data) accidentally added one back in at around the same time, probably due to a rebase/merge mistake.
Take the 'register' declaration back out. Change-Id: I73f206a57ab6b97195771e39556d2b0064be4cf3 Reviewed-on: https://gerrit.openafs.org/13346 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 4a2b5101afda24b2d937e7350ca35b0b3d3c4af8 Author: Benjamin Kaduk Date: Wed May 30 19:38:57 2018 -0500 CellServDB update 14 May 2018 Update all three copies in the tree, and the rpm specfile. Change-Id: I572ff4e39ab757128f0082a4f447565e94b8dee5 Reviewed-on: https://gerrit.openafs.org/13134 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 02dede5d40a55421ab4f093c1c90b8f785a40ec1 Author: Andrew Deason Date: Mon Aug 20 14:53:35 2018 -0500 Log binding ip address and port during startup Many daemons currently have the ability to bind to a specific ip address using the -rxbind parameter. The behavior can be a little unintuitive, however, since we only bind to the ip address we find via NetInfo/NetRestrict processing, and only if we end up with a single ip address. Since that processing involves examining the set of ip addresses available, this can have confusing results if, for instance, a daemon starts up while an administrator is changing the local ip configuration. If a daemon binds to a different ip address than the administrator expects, this can be very confusing, especially since for most daemons we don't log our bound ip anywhere. To help alleviate this, change the startup code for all of our daemons to log what ip we are trying to bind to (or "0.0.0.0" if none), along with our local port. Change-Id: I18d3647c4d134177a0a17c6a64583d444558a9f6 Reviewed-on: https://gerrit.openafs.org/13272 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 794748af87134d0b89fbca3be6e0480a96a0655c Author: Michael Meffie Date: Tue Oct 10 22:57:01 2017 -0400 fsprobe: add fsprobe_Wait function Move the lwp code to wait in the fsprobe applications down to the fsprobe library. 
This is a non-functional change in anticipation of converting the fsprobe library and programs to pthreads. Change-Id: I2972b13e2e3eeb691c64c91b0640bbc97e7d0b21 Reviewed-on: https://gerrit.openafs.org/12747 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 2c1a7e47336c8f8d14dd6c65d53925a9e0e87c66 Author: Michael Meffie Date: Mon Oct 9 22:23:31 2017 -0400 xstat: add xstat_*_Wait functions Add the xstat_cm_Wait and xstat_fs_Wait functions and move the code to wait for the xstat data collection to complete from the applications down to the xstat library. This is a non-functional change in anticipation of converting the xstat library and programs to pthreads. Change-Id: Ifd1d6bcda618c89b4ce46e1e64f33b0b30a89a72 Reviewed-on: https://gerrit.openafs.org/12746 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 5ced6025b9f11fadbdf2e092bf40cc87499ed277 Author: Andrew Deason Date: Thu Nov 2 16:41:52 2017 -0500 rx: Convert rxinit_status to rx_IsRunning() Currently, all rx code examines the atomic rxinit_status to determine if rx is running (that is, if rx_InitHost has been called, and rx_Finalize/shutdown_rx hasn't been called). This is used in rx.c to see if we're redundantly calling our setup/teardown functions, and outside of rx.c in a couple of places to see if rx-related resources have been initialized. The usage of rxinit_status is a little confusing, since setting bit 0 indicates that rx is not running, and clearing bit 0 indicates rx is running. Since using rxinit_status requires atomic functions, this makes code checking or setting rxinit_status a little verbose, and it can be hard to see what it is checking for. (For example, does 'if (!rx_atomic_test_and_clear_bit(&rxinit_status, 0))' succeed when rx is running, or when rx is not running?) The current usage of rxinit_status in rx_InitHost also does not handle initialization errors correctly.
rx_InitHost clears rxinit_status near the beginning of the function, but does not set rxinit_status if an error is encountered. This means that any code that checks rxinit_status (such as another rx_InitHost call) will think that rx was initialized successfully, but various resources aren't actually set up. This can cause segfaults and other errors as the code tries to actually use rx. This can easily be seen in bosserver, if bosserver is started up while the local host/port is in use by someone else. bosserver will try to rx_InitHost, which will fail, and then we'll try to rx_InitHost again, which will immediately succeed without doing any init. We then segfault quickly afterwards as we try to use uninitialized rx resources. To fix all of this, refactor code using rxinit_status to use a new function, called rx_IsRunning(), to make it a little clearer what we're checking for. We also re-introduce the LOCK_RX_INIT locks to prevent functions like rx_InitHost and rx_Finalize from running in parallel. Note that non-init/shutdown code (such as rx_upcall or rx_GetIFInfo) does not need to wait for LOCK_RX_INIT to check if rx is running or not. These functions only care if rx is currently set up enough to be used, so we can immediately return a 'yes' or 'no' answer. That is, if rx_InitHost is in the middle of running, rx_IsRunning returns 0, since some resources may not be fully initialized. Change-Id: Ia14a6a725c9662b9db0adef48c33b48a93ffe051 Reviewed-on: https://gerrit.openafs.org/12761 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 00aa9200be86b187c903503e56b2af55639ea2b8 Author: Andrew Deason Date: Sat Sep 22 01:58:17 2018 -0500 SOLARIS: Fix libafs $(KOBJ) parallel make race Currently, our COMPDIRS make rule for SOLARIS libafs builds looks like this: ${COMPDIRS} ${INSTDIRS} ${DESTDIRS}: for t in $(KOBJ) ; do # set some variables ; \ cd $$t ; \ $(MAKE) $@_libafs || exit $$?
; \ cd ../ ;\ done And Makefile.common has this: all: setup $(COMPDIRS) Where the 'setup' rule creates the $(KOBJ) dirs and sets up some symlinks. For parallel builds, this means that our commands in the ${COMPDIRS} target can be running in parallel with the 'setup' target, and so our $(KOBJ) dirs may not exist by the time we try to 'cd $$t'. For single-KOBJ platforms this actually largely works, since the 'cd' will fail, but then the subsequent 'make' will run (just in the wrong dir), but this can cause us to wastefully re-compile the same source files (and cause some possibly confusing error messages). For platforms with multiple KOBJs, this causes obvious problems, since we don't cd into each KOBJ dir. To solve this, just have the ${COMPDIRS}/etc rule depend on setup, so we know that 'setup' has finished running. Also change our way of 'cd'ing into each KOBJ dir to actually cause the rule to fail, to make any errors here more obvious and consistent. Change-Id: Id2e662f36ef47a6182716728167b2da4713893c6 Reviewed-on: https://gerrit.openafs.org/13344 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 68be8d7a1884fe678016b5ea20c16b3b124e8406 Author: Andrew Deason Date: Fri Sep 21 22:13:25 2018 -0500 SOLARIS: Fix platforms for KOBJ definition Currently, we define KOBJ to "MODLOAD32 MODLOAD64" for the following platforms: Which doesn't make any sense, since "all" includes sun4x_511 and sunx86_511. The previous commits that modified this line, e4c2810f (Remove support for Solaris pre-8) and c6a22d67 (SOLARIS: Do not build x86 kernel module on 5.11), clearly meant to change the platforms sun4x_511 and sunx86_511 to use the KOBJ on the next line, but omitted the leading "-" for the platform. This doesn't break anything, since the Makefile on these platforms expands to: KOBJ = MODLOAD32 MODLOAD64 KOBJ = MODLOAD64 So the first KOBJ line is effectively ignored. 
It's confusing, though, so fix this line so these platforms only get one KOBJ definition. Change-Id: Idea9fdee4ac5883428748c2a5fdfa9707406436a Reviewed-on: https://gerrit.openafs.org/13343 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit c1d39153da00d5525b2f7874b2d214a7f1b1bb86 Author: Andrew Deason Date: Thu Sep 6 13:42:11 2018 -0500 Run ctfconvert/ctfmerge for all objects Commit 88cb536f (autoconf: detect ctf-tools and add ctf to libafs) introduced running ctfconvert and ctfmerge for libafs on Solaris, but didn't add any CTF data for userspace code. This commit causes the same commands to be run for every binary that we build (if the ctf tools are available). To accomplish this, also refactor how we run ctfconvert and ctfmerge. The approach in commit 88cb536f would require us to modify the makefile rule for every executable to run RUN_CTFCONVERT and RUN_CTFMERGE, which is somewhat impractical. So instead in this commit, we modify all of our *_CCRULE and *_LDRULE variables to wrap the compiler invocation with the new CC_WRAPPER script. This means our *RULE variables change from something like this: FOO_CCRULE = $(RUN_CC) $(CC) $(XXX_FLAGS) -o $@ to something like this: FOO_CCRULE = $(RUN_CC) $(CC_WRAPPER) $(CC) $(XXX_FLAGS) -o $@ CC_WRAPPER expands to the script src/config/cc-wrapper, which just runs ctfconvert or ctfmerge on the relevant files after the compiler/linker runs. If the CTF tools are not configured, CC_WRAPPER expands to nothing, to limit our impact on other platforms. This commit was developed in collaboration with mbarbosa@sinenomine.net. 
Change-Id: Id19ba9d739edc68f01c2db7d5caa20758ec8144a Reviewed-on: https://gerrit.openafs.org/13308 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 78ed034603781a979687a45c08eb8b13e515e8bf Author: Andrew Deason Date: Tue Aug 7 11:17:43 2018 -0500 Call rx_InitHost once during daemon startup Currently, a few daemons call rx_InitHost in different places, and under different conditions. For example, vlserver calls rx_InitHost only when we -rxbind to a specific ip address, and then also makes an additional rx_Init call. Other daemons always call rx_InitHost, or just call rx_InitHost sometimes and don't make an extra rx_Init call. To try to make the various daemons behave a little more consistently, change the startup code to always call rx_InitHost, and to only call it once. Note that rx_Init is the same as calling rx_InitHost with INADDR_ANY as the ip address, and calling rx_Init* after a previous rx_Init* call is effectively a no-op. Change-Id: Ifd15175349a7b4695e684ca82deb8a8af5063073 Reviewed-on: https://gerrit.openafs.org/13271 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 38a094137f067255c586dd5c85f3040d7a7c4486 Author: Andrew Deason Date: Fri Sep 21 17:16:52 2018 -0500 pthread.m4: Add missing 'test' to conditional Commit c5def62d (autoconf: update pthread checks) accidentally omitted a 'test' in one of the conditionals. This causes an ugly error message during configure: checking for pthread_attr_init in -lpthread... yes ./configure[31043]: x-lpthread: not found [No such file or directory] Replace the missing 'test'. Change-Id: I28b82594e43a4ab42a5eb9fcc78e0ce8c5517d8b Reviewed-on: https://gerrit.openafs.org/13342 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 3fae4ea19a175aed7ff3f6e9c7fdf2aa2f1b5cb3 Author: Mark Vitale Date: Wed Nov 9 16:58:00 2016 -0500 ubik: remove unused code ubik_GetVersion and ubik_WaitVersion have been unused since at least OpenAFS 1.0. Remove them. No functional change should be incurred by this commit. 
Change-Id: Iee6952f35d8c34e9f05a4e6011f5795f7222fb08 Reviewed-on: https://gerrit.openafs.org/13325 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 809ee49b80d7bc0e720aaebe78fb9ecfd453065d Author: Andrew Deason Date: Fri Sep 21 12:11:46 2018 -0500 Remove alpha_dux/alpha_osf references Several files were still referencing the alpha_dux* and alpha_osf* sysnames. The code for these platforms has been removed, so get rid of this cruft. Change-Id: I042fcc29be322bf557829974242553bb6d5b2be4 Reviewed-on: https://gerrit.openafs.org/13339 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 42625220bb615e2fd7f0dc24e50a502e0596e546 Author: Andrew Deason Date: Fri Sep 21 12:03:37 2018 -0500 libafs: Remove .i Makefile rules Makefile.common.in defines a suffix rule to generate .i files from .c files, but we never actually need to do this. The rule originates from before OpenAFS 1.0, which also did not use the rule. Remove the unused definitions. Change-Id: I057b2aca7d17e3e85e93d886a65c954e8d9d708f Reviewed-on: https://gerrit.openafs.org/13338 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 930d8ee638112ca8bf27a9528c0a527cfab54c7d Author: Mark Vitale Date: Fri Aug 17 18:48:08 2018 -0400 volser: ensure GCTrans transaction walk remains valid Commit bc56f5cc97a982ee29219e6f258b372dbfe1a020 ("volser: Delete timed-out temporary volumes") introduced new logic to GCTrans(). Unfortunately, part of this logic temporarily drops VTRANS_LOCK in order to call VPurgeVolume(). While this lock is dropped, other volser_trans may be added or deleted from the allTrans list. Therefore, GCTrans should not trust the next pointer (nt = tt->next) which was obtained before the lock was dropped. One symptom observed in the field was a segfault while examining tt->volume. Neither tt nor volume were valid any longer, since tt had been set from a stale nt at the top of the loop. 
To repair, improve, and clarify this logic: - Refactor so nt is assigned correctly and as late as possible. - Add comments to explain the placement of the assigns to future maintainers. Change-Id: Ibd3a504bddd3622730aa349576341e20f2f27836 Reviewed-on: https://gerrit.openafs.org/13286 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 89b50fdec9ab2dafe24b873f25c2cdb71b154e44 Author: Marcio Barbosa Date: Sat Aug 11 15:51:05 2018 -0400 volser: add more logs for failures during restore In the current version of the volserver, some failures during volume restores are not logged. In order to help debugging, this commit introduces extra logs for possible failures during this process, so we guarantee that an error at any point during the restore causes a message to be logged. Change-Id: I3647155aeb3f10316d9d7fecb5b126efc909f7b4 Reviewed-on: https://gerrit.openafs.org/13252 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 7c27365ea24aed5787f6fc03f30f6085c78ece51 Author: Michael Meffie Date: Mon Oct 9 22:16:09 2017 -0400 afsmonitor: remove unused LWP_WaitProcess Remove the unimplemented once-only flag and the unused LWP_WaitProcess call. Change-Id: Idec5815f6f20019b9be4b973794d8b05cea7f6c9 Reviewed-on: https://gerrit.openafs.org/12745 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 95b0641ad8cfd0358576c6e1a93266fc59ecf710 Author: Mark Vitale Date: Thu Sep 6 14:09:26 2018 -0400 volser: combine GCTrans conditional clauses In preparation for a future commit, combine two conditional clauses in GCTrans(). No functional change should be incurred by this commit. 
Change-Id: Ib08d5b83dd26327124fe0119e6e5f459adc5f78a Reviewed-on: https://gerrit.openafs.org/13303 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit f62fb17b3cf1c886f8cfc2fabe9984070dd3eec4 Author: Michael Meffie Date: Tue Apr 19 20:46:33 2016 -0400 ubik: positional io for db reads and writes The ubik library was written before positional i/o was available and issues an lseek system call for each database file read and write. This change converts the ubik database accesses to use positional i/o on platforms where pread and pwrite are available, in order to reduce system call load. The new inline uphys_pread and uphys_pwrite functions are supplied on platforms which do not supply pread and pwrite. These functions fall back to non-positional i/o. If these symbols are present in the database server binary then the server process will continue to call lseek before each read and write access of the database file. This change does not affect the whole-file database synchronization done by ubik during database recovery (via the DISK_SendFile and DISK_GetFile RPCs), which still uses non-positional i/o. However, that code does not share file descriptors with the phys.c code, so there is no possibility of mixing positional and non-positional i/o on the same FDs. Change-Id: I28accd24f7f27b5e8a4f1dd0e3e08bab033c16e0 Reviewed-on: https://gerrit.openafs.org/12272 Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Michael Meffie Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 8375a7f7dd0e3bcbf928a23f874d1a15a952cdef Author: Marcio Barbosa Date: Sat Aug 11 14:00:18 2018 -0400 volser: warn if older version of volume is restored Volume restores work by overwriting vnodes with the data in the given volume dump. 
If we restore a partial incremental dump from an older version of the volume, this generally results in a partly-corrupted volume, since directory vnodes may contain references that don't exist in the current version of the volume (or are supposed to be in a different directory). Currently, the volserver does not prevent restoring older volume data to a volume, and this doesn't necessarily always result in corrupted data (for instance, if we are restoring a full volume dump over an existing volume). But restoring old volume data seems more likely to be a mistake, since reverting a volume back to an old version, even without corrupting data, is a strange thing to do and may cause problems with our methods of cache consistency. So, log a warning when this happens, so if this is a mistake, it doesn't happen silently. But we still do not prevent this action, since it's possible something could be doing this intentionally. We detect this just by checking if the updateDate in the given header is older than the current updateDate for the volume on disk. Note: Restoring a full dump file (-overwrite f) will not result in corrupted data. In this scenario, the restore operation removes the volume on disk first (if present). After that, the dump file is restored. In this case, we do not log anything (the volume is not corrupted). Change-Id: Iac55cc8bb1406ca6af9a5e43e7d37c6bfa889e91 Reviewed-on: https://gerrit.openafs.org/13251 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit dc25b9f509385bef7e6f73f03a796ea033922300 Author: Michael Meffie Date: Fri Oct 27 23:25:10 2017 -0400 update: convert upserver and client from LWP to pthreads Build the upserver and the upclient with pthreads instead of LWP and convert the IOMGR sleeps in the client to regular sleeps. 
Change-Id: I183765ef180f34d38b87a13ec49f16f4a60afcc8 Reviewed-on: https://gerrit.openafs.org/12754 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit 066b3a9fd7a4d99e8aefe5cc20e57b31b137f979 Author: Pat Riehecky Date: Fri Jun 1 16:29:25 2018 -0500 Correct some redundant if() clauses A few if() conditions currently contain redundant syntax, due to typos. Fix the conditions to actually check different things, according to what the author probably originally intended. (via cppcheck) Change-Id: I7e46217e1f84fe65677ada345d227f31f1988fe6 Reviewed-on: https://gerrit.openafs.org/13157 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 6892bfbd701899281b34ee337637d438c7d8f8c6 Author: Michael Meffie Date: Wed Apr 20 18:17:16 2016 -0400 ubik: remove unnecessary lseeks in uphys_open The ubik database file access layer has a file descriptor cache to avoid reopening the database file on each file access. However, the file offset is reset with lseek on each and every use of the cached file descriptor, and the file offset is set twice when reading or writing data records. This change removes unnecessary and duplicate lseek system calls to reduce the system call load. Change-Id: I460b226d81e4eb64dc87918175acab495aa698cd Reviewed-on: https://gerrit.openafs.org/12271 Reviewed-by: Andrew Deason Reviewed-by: Mark Vitale Reviewed-by: Marcio Brito Barbosa Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit da699c8b81e818ba97ff8115397d7f7afe0bf512 Author: Michael Meffie Date: Mon Sep 10 23:47:33 2018 -0400 klog.krb5 -lifetime is not implemented The klog.krb5 -lifetime option was copied from earlier versions of log and klog, which had the ability to set the krb4 token lifetime. However, the -lifetime option is not feasible in the krb5 version, and so is not implemented in klog.krb5. Update the klog.krb5 man page to document that the -lifetime option has no effect. 
Remove the code which unnecessarily checks the unused klog.krb5 -lifetime command line argument. The unused lifetime variable was discovered by Pat Riehecky using the clang scan-build static analyzer. Change-Id: I5f459ec46eaff87a69ccdf7de386a671d0944a5a Reviewed-on: https://gerrit.openafs.org/13309 Tested-by: BuildBot Reviewed-by: PatRiehecky Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 8f314560c9b00acb63e1929503f6bf2e43bb1ff6 Author: Michael Meffie Date: Tue Sep 11 12:03:30 2018 -0400 util: add defines for ktime never and now values Add preprocessor symbolic names for ktime values representing never and right now. The names are intended to be consistent with the ktime date never value definition. This commit does not make any functional change. Change-Id: Ia6735b585e50aeb018481f76552fbb4f607b8529 Reviewed-on: https://gerrit.openafs.org/13310 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 800318b43fdf461ad95cd7f3940718f3f0a609a7 Author: Andrew Deason Date: Thu May 10 16:22:52 2018 -0500 ubik: Buffer log writes with stdio Currently, when we write ubik i/o operations to the db log, we tend to issue several syscalls involving small writes and fstat()s. This is because each "log" operation involves at least one write, and each log operation tends to be pretty small. Each logged operation hitting disk separately is unnecessary, since the db log does not need to hit the disk at all until we are ready to commit the transaction. So to reduce the number of syscalls when writing to the db, change our log writes to be buffered in memory (using stdio calls). This also avoids needing to fstat() the underlying log file, since we open the underlying file in append-only mode, since we only ever append to (and truncate) the log file. 
To implement this, we introduce a new 'buffered_append' phys operation, to explicitly separate our buffered and non-buffered operations, to try to avoid any bugs from mixing buffered and non-buffered i/o. This new operation is only used for the db log. Change-Id: I5596117c6c71ab7c2d552f71b0ef038f387e358a Reviewed-on: https://gerrit.openafs.org/13070 Reviewed-by: Mark Vitale Reviewed-by: Michael Meffie Reviewed-by: Joe Gorse Reviewed-by: Benjamin Kaduk Reviewed-by: Marcio Brito Barbosa Tested-by: BuildBot commit 93fd6d31ce441c5ab394f31355584d17ef6e455a Author: Marcio Barbosa Date: Mon Sep 10 18:14:55 2018 +0000 autoconf: Use `uname -p` instead of $HOST_CPU for ctf tools Currently, we check if the ctf tools are present searching for them in a few directories. One of these directories (/opt/onbld/bin/$HOST_CPU) looks at the $HOST_CPU variable, which on x86 can be 'x86_64' or 'i386', but the only valid directories for the onbld tools are 'i386' and 'sparc'. So instead of $HOST_CPU, just use $(uname -p), which is only ever 'i386' on x86, and 'sparc' on sparc. [adeason@sinenomine.net: reword commit message] Change-Id: I972cf1cc0dda81f5ee454b14ddbe2830c82c838d Reviewed-on: https://gerrit.openafs.org/13275 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit fa55a3fe77b4adfce4071fe73f02687e65d4e027 Author: Michael Meffie Date: Sat Jun 9 05:16:02 2018 +0000 doc: the last partition name is /vicepiu The last valid partition name supported by OpenAFS is /vicepiu, not /vicepiv. Update the docs and man pages to say so. Change-Id: I6e1cce775d332d76f605a26f16502c651461994b Reviewed-on: https://gerrit.openafs.org/13177 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 3588127191c3ebf2e411212bbea9a33a9081e009 Author: Michael Meffie Date: Sat Jun 9 04:39:49 2018 +0000 tests: partition name to id function tests Add unit tests for the utility functions to convert between partition names and partition ids. 
Change-Id: I4b12f9d611cb9f3ce49909cda5cbcedd3e6c3d10 Reviewed-on: https://gerrit.openafs.org/13176 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 70926959094497d440daf9a78e1e1ea5a7ddc9b8 Author: Ben Kaduk Date: Mon Dec 9 15:26:06 2013 -0500 Add rxgk_crypto_rfc3961.c rxgk wrappers around an external crypto library, in this case, our in-tree rfc3961 library. Primitives for encryption/decryption and MIC/VerifyMIC, ways to generate and free rxgk_key objects, etc.. Change-Id: I7525086043baf54f5c3019b3f5ab3495760c4236 Reviewed-on: https://gerrit.openafs.org/10565 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 6534b10a4180ec10bceebbc11405718e7969fa21 Author: Andrew Deason Date: Thu Jul 26 15:48:00 2018 -0500 Remove DUX/OSF code Remove code for DUX/OSF platforms. DUX code was removed from the libafs client in commit 392dcf67 ("Complete removal of DUX client code") and the alpha_dux* param files were removed in dc4d9d64 ("afs: Remove AFS_BOZONLOCK_ENV"). This code has always been disabled since those commits, so remove any code referencing AFS_DUX*_ENV, AFS_OSF_ENV, and related symbols. Change-Id: I3787b83c80a48e53fe214fdecf9a9ac0b63d390c Reviewed-on: https://gerrit.openafs.org/13260 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 8ad4e15ffc883c9a99f9636d7d8a5ed0a2fcc26a Author: Marcio Barbosa Date: Tue May 31 09:08:08 2016 -0300 venus: fix memory leak In GetPrefCmd, when we request server prefs from the kernel and our output buffer is not big enough, pioctl() will return E2BIG and we allocate more memory and try again. However, if the size of the output buffer reaches 16k bytes and this space is still not enough (or if pioctl fails and errno != E2BIG), we return without releasing the memory that was previously allocated. To fix this problem, free our output buffer when this happens. 
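The grow-and-retry pattern described in the venus commit above can be sketched roughly as follows. This is a minimal illustration, not the actual GetPrefCmd code: fake_query() is a hypothetical stand-in for pioctl(), get_prefs() is an invented name, and the 2048-byte starting size is an assumption; only the 16k cap and the E2BIG behaviour come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MAX_OUT_SIZE 16384      /* 16k cap mentioned in the commit message */

/* Hypothetical stand-in for pioctl(): fails with E2BIG until the
 * buffer can hold at least `needed` bytes. */
static int fake_query(char *out, size_t out_size, size_t needed)
{
    if (out_size < needed) {
        errno = E2BIG;
        return -1;
    }
    out[0] = '\0';
    return 0;
}

/* On success, returns 0 and hands ownership of the buffer to the caller.
 * On failure, returns -1 and leaves nothing allocated. */
static int get_prefs(size_t needed, char **result)
{
    size_t size = 2048;         /* assumed starting size */
    char *buf = NULL;

    assert(result != NULL);
    for (;;) {
        char *tmp = realloc(buf, size);
        if (tmp == NULL) {
            free(buf);
            return -1;
        }
        buf = tmp;
        if (fake_query(buf, size, needed) == 0) {
            *result = buf;
            return 0;
        }
        /* The fix: release the buffer on both failure paths
         * (non-E2BIG error, or the size cap has been reached). */
        if (errno != E2BIG || size >= MAX_OUT_SIZE) {
            free(buf);
            return -1;
        }
        size *= 2;
    }
}
```

A caller that receives 0 is expected to free(*result) when done; on -1 there is nothing to release.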
Change-Id: Ib34cb12629528ddf2a763386f0ac5494eb8be695 Reviewed-on: https://gerrit.openafs.org/12293 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit c553170bcf3b97ba3745f21040c8e07b128ef983 Author: Jeffrey Altman Date: Wed Jun 6 21:23:14 2018 -0400 rx: reset packet header userStatus field on reuse OpenAFS Rx fails to set the rx packet header userStatus field for most packets sent other than type RX_PACKET_TYPE_ACK. If the userStatus field is not set, its value will be random garbage based upon the prior use of the memory allocated to the rx_packet. This change explicitly sets the userStatus field to zero for all DATA and Special packet types. Background ---------- OpenAFS Rx allocates a pool of rx_packet structures that are reused for both incoming and outgoing Rx packets throughout the lifetime of the process (or kernel module). The rx packet header field userStatus is set by rxi_Send() to rx_call.localStatus. rxi_Send() is called from both rxi_SendAck() when sending RX_PACKET_TYPE_ACK packets and from rxi_SendSpecial() when called with a non-NULL call structure (RX_PACKET_TYPE_BUSY, RX_PACKET_TYPE_ACKALL, or RX_PACKET_TYPE_ABORT). rx_call.localStatus defaults to zero and can be modified by the application calling rx_SetLocalStatus(). The userStatus field is neither set nor reset when sending RX_PACKET_TYPE_DATA packets and all packets sent without a call structure. When allocated packets are reused in these cases, the value of the userStatus leaks from the prior packet use. The userStatus field is expected to be zero unless intentionally set by the application protocol to another value. The AFS3 suite of rx services uses the rx_header.userStatus field only in the RXAFS service and only as part of the definition for RXAFS_StoreData and RXAFS_StoreData64 RPCs. 
The StoreData RPCs use the rx_header.userStatus field as an out-of-band communication mechanism that permits the fileserver to signal to the cache manager when the RXAFS_StoreData[64] has been assigned to an application worker (thread) and the worker has acquired all of the required locks and other resources necessary to complete the RPC. This signal can be sent before all of the application data has been received. The cache manager reads the userStatus value via rx_GetRemoteStatus(). When bit-0 of the remote status value equals one and CSafeStore mode is disabled, the cache manager can wakeup any threads blocked waiting for the store operation to complete. Cache managers that perform a workload heavy in RXAFS_StoreData[64] RPCs will end up with an increasing percentage of packets in which the userStatus field is one instead of zero. Fileservers processing a workload heavy in RXAFS_StoreData[64] RPCs will likewise end up with an increasing percentage of packets in which the userStatus field is one instead of zero. Cache managers and Fileservers will therefore send DATA and call free special packets with a non-zero userStatus field to peer services (RXAFS, RXAFSCB, VL, PR). The failure to reset the userStatus field has not been a problem in the past because only the OpenAFS cache manager has ever queried the userStatus via rx_GetRemoteStatus() and only when issuing RXAFS_StoreData[64] RPCs. Failure to correct this flaw interferes with future use of the userStatus field in yet to be registered AFS3 RPCs and existing non-AFS3 services that make use of the userStatus when sending data to a service. 
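The leak described in the rx commit above can be illustrated with a much-reduced model of packet reuse. All names here (struct pkt, struct call, prepare_send, the PKT_* values) are hypothetical and are not the rx structures or the RX_PACKET_TYPE_* constants; the sketch only shows a recycled header field being reset on the paths that previously left it untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum pkt_type { PKT_DATA = 1, PKT_ACK = 2 };   /* hypothetical values */

struct pkt_header {
    uint8_t type;
    uint8_t user_status;
};

struct pkt {                    /* recycled from a pool between uses */
    struct pkt_header hdr;
};

struct call {
    uint8_t local_status;       /* defaults to zero unless the application sets it */
};

/* Prepare a recycled packet for sending. ACKs sent on a call carry the
 * call's local status; everything else is explicitly reset to zero so
 * that the value from the packet's previous use cannot leak. */
static void prepare_send(struct pkt *p, enum pkt_type type, const struct call *c)
{
    assert(p != NULL);
    p->hdr.type = (uint8_t)type;
    if (type == PKT_ACK && c != NULL)
        p->hdr.user_status = c->local_status;
    else
        p->hdr.user_status = 0;
}
```

Without the else branch, a packet first used for an ACK with status 1 and then reused for DATA would still carry 1 in its header.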
Change-Id: I32c0bba93b8e5197036d92168956b6e2a95406fb FIXES: 134554 Reviewed-on: https://gerrit.openafs.org/13165 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 2d8045d67686fbb80696b47b4a60e48e7e74fec9 Author: Mark Vitale Date: Tue Sep 11 15:59:41 2018 -0400 budb: SBUDB_FindLatestDump should check result of FillDumpEntry FillDumpEntry may return an error, but FindLatestDump doesn't check its result. Therefore, SBUDB_FindLatestDump may return invalid results. Instead, check the return code from FillDumpEntry and abort the call if it fails. Change-Id: If0b44ba2a12a76511129d77110ef669b00780ff0 Reviewed-on: https://gerrit.openafs.org/13312 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 91bab84e7a3b7de2591c475ba4912b0db8899f05 Author: Mark Vitale Date: Tue Sep 11 16:29:59 2018 -0400 butc: repair build error Commit c43169fd36348783b1a5a55c5bb05317e86eef82 introduced a build error by invoking TLog with an extraneous set of internal parentheses. Remove the offending parentheses. Change-Id: Ibc52501b01ecbe9f86262566446d63e66486272f Reviewed-on: https://gerrit.openafs.org/13311 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit d5816fd6cd1876760a985a817dbbb3940cf3bddb Author: Benjamin Kaduk Date: Tue Sep 11 10:51:01 2018 -0500 Fix typos in audit format strings Commit 9ebff4c6caa8b499d999cfd515d4d45eb3179769 introduced audit framework support for several butc-related data types, but had a typo ('$d' for '%d') in a couple of places, that was not reported by compiler format-string checking. Fix the typo to properly print all the auditable data. Change-Id: Ibefa9f8f1c0567bc6fe606327af26fcb0dbeadba commit 345ee34236c08a0a2fb3fff016edfa18c7af4b0a Author: Benjamin Kaduk Date: Sun Sep 9 10:44:38 2018 -0500 OPENAFS-SA-2018-001 backup: use authenticated connection to butc Use the standard routine to pick a client security object, instead of always assuming rxnull. 
Respect -localauth as well as being able to use the current user's tokens, but also provide a -nobutcauth argument to fall back to the historical rxnull behavior (but only for the connections to butc; vldb and budb connections are not affected). Change-Id: Ibf8ebe5521bee8d0f7162527e26bc5541d07910d commit 736364f1e3426b7b15836cd95ce25f0e516ce3f2 Author: Benjamin Kaduk Date: Thu Sep 6 18:50:39 2018 -0500 OPENAFS-SA-2018-001 butc: require authenticated connections with -localauth The butc -localauth option is available to use the cell-wide key to authenticate to the vlserver and buserver, which in normal deployments will require incoming connections to be authenticated as a superuser. In such cases, the cell-wide key is also available for use in authenticating incoming connections to the butc, which would otherwise have been completely unauthenticated. Because of the security hazards of allowing unauthenticated inbound RPCs, especially ones that manipulate backup information and are allowed to initiate outbound RPCs authenticated as the superuser, default to not allowing unauthenticated inbound RPCs at all. Provide an opt-out command-line argument for deployments that require this functionality and have configured their network environment (firewall/etc.) appropriately. Change-Id: Ia6349757a4c6d59d1853df1a844e210d32c14feb commit c43169fd36348783b1a5a55c5bb05317e86eef82 Author: Benjamin Kaduk Date: Sun Sep 9 11:49:03 2018 -0500 OPENAFS-SA-2018-001 Add auditing to butc server RPC implementations Make the actual implementations into helper functions, with the RPC stubs calling the helpers and doing the auditing on the results, akin to most other server programs in the tree. This relies on support for some additional types having been added to the audit framework. 
Change-Id: Ic872d6dfc7854fa28bd3dc2277e92c7919d0d0c0 commit 9ebff4c6caa8b499d999cfd515d4d45eb3179769 Author: Benjamin Kaduk Date: Sat Sep 8 19:42:36 2018 -0500 OPENAFS-SA-2018-001 audit: support butc types Add support for several complex butc types to enable butc auditing. Change-Id: I6aedd933cf5330cda40aae6f33827ae65409df32 commit 50216dbbc30ed94f89bdd0e964f4891e87f28c0b Author: Benjamin Kaduk Date: Sat Sep 8 20:35:25 2018 -0500 OPENAFS-SA-2018-001 butc: remove dummy osi_audit() routine This local stub was present in the original IBM import and is unused. It will conflict with the real audit code once we start adding auditing to the TC_ RPCs, so remove it now. Change-Id: I3e74e01464af122f245c3b0fe8f3985e422d13b4 commit a4c1d5c48deca2ebf78b1c90310b6d56b3d48af6 Author: Mark Vitale Date: Fri Jul 6 03:14:19 2018 -0400 OPENAFS-SA-2018-003 rxgen: prevent unbounded input arrays RPCs with unbounded arrays as inputs are susceptible to remote denial-of-service (DOS) attacks. A malicious client may submit an RPC request with an arbitrarily large array, forcing the server to expend large amounts of network bandwidth, cpu cycles, and heap memory to unmarshal the input. Instead, issue an error message and stop rxgen when it detects an RPC defined with an unbounded input array. Thus we will detect the problem at build time and prevent any future unbounded input arrays. Change-Id: Ib110f817ed1c8132ea2549025876a5200c728fab commit 8b92d015ccdfcb70c7acfc38e330a0475a1fbe28 Author: Mark Vitale Date: Fri Jul 6 03:21:26 2018 -0400 OPENAFS-SA-2018-003 volser: prevent unbounded input to various AFSVol* RPCs Several AFSVol* RPCs are defined with an unbounded XDR "string" as input. RPCs with unbounded arrays as inputs are susceptible to remote denial-of-service (DOS) attacks. A malicious client may submit an AFSVol* request with an arbitrarily large string, forcing the volserver to expend large amounts of network bandwidth, cpu cycles, and heap memory to unmarshal the input. 
Instead, give each input "string" an appropriate size. Volume names are inherently capped to 32 octets (including trailing NUL) by the protocol, but there is less clearly a hard limit on partition names. The Vol_PartitionInfo{,64} functions accept a partition name as input and also return a partition name in the output structure; the output values have wire-protocol limits, so larger values could not be retrieved by clients, but for denial-of-service purposes, a more generic PATH_MAX-like value seems appropriate. We have several varying sources of such a limit in the tree, but pick 4k as the least-restrictive. [kaduk@mit.edu: use a larger limit for pathnames and expand on PATH_MAX in commit message] Change-Id: Iea4b24d1bb3570d4c422dd0c3247cd38cdbf4bab commit 97b0ee4d9c9d069e78af2e046c7987aa4d3f9844 Author: Mark Vitale Date: Fri Jul 6 01:09:53 2018 -0400 OPENAFS-SA-2018-003 volser: prevent unbounded input to AFSVolForwardMultiple AFSVolForwardMultiple is defined with an input parameter that is defined to XDR as an unbounded array of replica structs: typedef replica manyDests<>; RPCs with unbounded arrays as inputs are susceptible to remote denial-of-service (DOS) attacks. A malicious client may submit an AFSVolForwardMultiple request with an arbitrarily large array, forcing the volserver to expend large amounts of network bandwidth, cpu cycles, and heap memory to unmarshal the input. Even though AFSVolForwardMultiple requires superuser authorization, this attack is exploitable by non-authorized actors because XDR unmarshalling happens long before any authorization checks can occur. Add a bounding constant (NMAXNSERVERS 13) to the manyDests input array. This constant is derived from the current OpenAFS vldb implementation, which is limited to 13 replica sites for a given volume by the layout (size) of the serverNumber, serverPartition, and serverFlags fields. 
[kaduk@mit.edu: explain why this constant is used] Change-Id: Id12c6a7da4894ec490691eb8791dcd3574baa416 commit 124445c0c47994f5e2efef30e86337c3c8ebc93f Author: Mark Vitale Date: Thu Jul 5 23:51:37 2018 -0400 OPENAFS-SA-2018-003 budb: prevent unbounded input to BUDB_SaveText BUDB_SaveText is defined with an input parameter that is defined to XDR as an unbounded array of chars: typedef char charListT<>; RPCs with unbounded arrays as inputs are susceptible to remote denial-of-service (DOS) attacks. A malicious client may submit a BUDB_SaveText request with an arbitrarily large array, forcing the budb server to expend large amounts of network bandwidth, cpu cycles, and heap memory to unmarshal the input. Modify the XDR definition of charListT so it is bounded. This typedef is shared (as an OUT parameter) by BUDB_GetText and BUDB_DumpDB, but fortunately all in-tree callers of the client routines specify the same maximum length of 1024. Note: However, SBUDB_SaveText server implementation seems to allow for up to BLOCK_DATA_SIZE (2040) = BLOCKSIZE (2048) - sizeof(struct blockHeader) (8), and it's unknown if any out-of-tree callers exist. Since we do not need a tight bound in order to avoid the DoS, use a somewhat higher maximum of 4096 bytes to leave a safety margin. [kaduk@mit.edu: bump the margin to 4096; adjust commit message to match] Change-Id: Ic3fe2758a9c97ed02c6e6d05f0de0865959b5b04 commit 7629209219bbea3f127b33be06ac427ebc3a559e Author: Mark Vitale Date: Thu Jul 5 21:11:30 2018 -0400 OPENAFS-SA-2018-003 vlserver: prevent unbounded input to VL_RegisterAddrs VL_RegisterAddrs is defined with an input argument of type bulkaddrs, which is defined to XDR as an unbounded array of afs_uint32 (IPv4 addresses): typedef afs_uint32 bulkaddrs<> The <> with no value instructs rxgen to build client and server stubs that allow for a maximum size of "~0u" or 0xFFFFFFFF. 
Ostensibly the bulkaddrs array is unbounded to allow it to be shared among VL_RegisterAddrs, VL_GetAddrs, and VL_GetAddrsU. The VL_GetAddrs* RPCs use bulkaddrs as an output array with a maximum size of MAXSERVERID (254). VL_RegisterAddrs uses bulkaddrs as an input array, with a nominal size of VL_MAXIPADDRS_PERMH (16). However, RPCs with unbounded array inputs are susceptible to remote denial-of-service attacks. That is, a malicious client may send a VL_RegisterAddrs request with an arbitrarily long array, forcing the vlserver to expend large amounts of network bandwidth, cpu cycles, and heap memory to unmarshal the argument. Even though VL_RegisterAddrs requires superuser authorization, this attack is exploitable by non-authorized actors because XDR unmarshalling happens long before any authorization checks can occur. Because all uses of the type that our implementation supports have fixed bounds on valid data (whether input or output), apply an arbitrary implementation limit (larger than any valid structure would be), to prevent this class of attacks in the XDR decoder. [kaduk@mit.edu: limit the bulkaddrs type instead of introducing a new type] Change-Id: Ibcc962ccc46aec7552b86d1d9fda7cc14310bc03 commit f5a80115f8f7f9418287547f0fc7fdb13d936f00 Author: Benjamin Kaduk Date: Thu Aug 30 10:38:56 2018 -0500 OPENAFS-SA-2018-002 butc: Initialize OUT scalar value In STC_ReadLabel, the interaction with the tape device is synchronous, so there is no need to allocate a task ID for status monitoring. However, we do need to initialize the output value, to avoid writing stack garbage on the wire. Change-Id: Id2066e1fe95fa1de02577dfd844697b1ae770f30 commit 7a7c1f751cdb06c0d95339c999b2c035c2d2168b Author: Mark Vitale Date: Tue Jun 26 06:01:16 2018 -0400 OPENAFS-SA-2018-002 ubik: prevent VOTE_Debug, VOTE_XDebug information leak VOTE_Debug and VOTE_XDebug (udebug) both leave a single field uninitialized if there is no current transaction.
This leaks the memory contents of the ubik server over the wire. struct ubik_debug - 4 bytes in member writeTrans In common code to both RPCs, ensure that writeTrans is always initialized. [kaduk@mit.edu: switch to memset] Change-Id: I91184b4ed0c159982a883ebaa9634406400eae93 commit b604ee7add7be416bf20973422a041e913d20761 Author: Mark Vitale Date: Tue Jun 26 05:26:21 2018 -0400 OPENAFS-SA-2018-002 kaserver: prevent KAM_ListEntry information leak KAM_ListEntry (kas list) does not initialize its output correctly. It leaks kaserver memory contents over the wire: struct kaindex - up to 64 bytes for member name - up to 64 bytes for member instance Initialize the buffer. [kaduk@mit.edu: move initialization to top of server routine] Change-Id: I5cc430fc996e7e89d38a384d092b9d4fad248fa4 commit be0142707ca54f3de99c4886530e7ac9f48dd61c Author: Mark Vitale Date: Tue Jun 26 05:12:32 2018 -0400 OPENAFS-SA-2018-002 butc: prevent TC_DumpStatus, TC_ScanStatus information leaks TC_ScanStatus (backup status) and TC_GetStatus (internal backup status watcher) do not initialize their output buffers. They leak memory contents over the wire: struct tciStatusS - up to 64 bytes in member taskName (TC_MAXNAMELEN 64) - up to 64 bytes in member volumeName " Initialize the buffers. [kaduk@mit.edu: move initialization to top of server routines] Change-Id: I0337d233e1dced56e351ed00471c9738fcd3b9db commit 52f4d63148323e7d605f9194ff8c1549756e654b Author: Mark Vitale Date: Tue Jun 26 05:00:25 2018 -0400 OPENAFS-SA-2018-002 butc: prevent TC_ReadLabel information leak TC_ReadLabel (backup readlabel) does not initialize its output buffer completely. It leaks butc memory contents over the wire: struct tc_tapeLabel - up to 32 bytes from member afsname (TC_MAXTAPELEN 32) - up to 32 bytes from member pname (TC_MAXTAPELEN 32) Initialize the buffer. 
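The pattern shared by these information-leak fixes can be sketched as follows, assuming a simplified reply structure: the whole output buffer is zeroed first, so bytes after the terminating NUL of each string never carry stale memory onto the wire. The names struct label_reply, LABEL_NAME_LEN, and fill_label_reply are hypothetical and are not the butc code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical size, chosen to mirror the 32-byte name fields above. */
#define LABEL_NAME_LEN 32

/* Hypothetical fixed-size reply that is marshalled in full. */
struct label_reply {
    char afsname[LABEL_NAME_LEN];
    char pname[LABEL_NAME_LEN];
};

/*
 * Fill an output structure. The memset comes first so that every byte
 * the caller later sends is defined, including the bytes that follow
 * each string's NUL terminator.
 */
void
fill_label_reply(struct label_reply *out, const char *afsname,
                 const char *pname)
{
    memset(out, 0, sizeof(*out));
    snprintf(out->afsname, sizeof(out->afsname), "%s", afsname);
    snprintf(out->pname, sizeof(out->pname), "%s", pname);
}
```

Without the memset, a short name such as "tape1" would leave the remaining 26 bytes of the field holding whatever was previously in that memory.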
[kaduk@mit.edu: move initialization to the RPC stub] Change-Id: I30f4aa32801791913b397a58c36c86c019dc51ef commit e96771471134102d3879a0ac8b2c4ef9d91a61b8 Author: Mark Vitale Date: Tue Jun 26 04:39:44 2018 -0400 OPENAFS-SA-2018-002 budb: prevent BUDB_* information leaks The following budb RPCs do not initialize their output correctly. This leaks buserver memory contents over the wire: BUDB_FindLatestDump (backup dump) BUDB_FindDump (backup volrestore, diskrestore, volsetrestore) BUDB_GetDumps (backup dumpinfo) BUDB_FindLastTape (backup dump) struct budb_dumpEntry - up to 32 bytes in member volumeSetName - up to 256 bytes in member dumpPath - up to 32 bytes in member name - up to 32 bytes in member tape.tapeServer - up to 32 bytes in member tape.format - up to 256 bytes in member dumper.name - up to 128 bytes in member dumper.instance - up to 256 bytes in member dumper.cell Initialize the buffer in common routine FillDumpEntry. Change-Id: Ic057a6c906ce2acd39e0e4ea0a0ba1e100bba3e9 commit 211b6d6a4307006da1467b3be46912a3a5d7b20b Author: Mark Vitale Date: Tue Jun 26 03:56:24 2018 -0400 OPENAFS-SA-2018-002 afs: prevent RXAFSCB_TellMeAboutYourself information leak RXAFSCB_TellMeAboutYourself does not completely initialize its output buffers. This leaks kernel memory over the wire: struct interfaceAddr Unix cache manager (libafs) - up to 124 bytes in array addr_in ((AFS_MAX_INTERFACE_ADDR 32 * 4) - 4)) - up to 124 bytes in array subnetmask " - up to 124 bytes in array mtu " Windows cache manager - 64 bytes in array addr_in ((AFS_MAX_INTERFACE_ADDR 32 - CM_MAXINTERFACE_ADDR 16)* 4) - 64 bytes in array subnetmask " - 64 bytes in array mtu " The following implementations of SRXAFSCB_TellMeAboutYourself are not susceptible: - fsprobe - libafscp - xstat_fs_test Initialize the buffer. 
Change-Id: I2ef868dd9269db7004a21cf913b6787948357d10 commit b52eb11a08f2ad786238434141987da27b81e743 Author: Mark Vitale Date: Tue Jun 26 03:47:41 2018 -0400 OPENAFS-SA-2018-002 afs: prevent RXAFSCB_GetLock information leak RXAFSCB_GetLock (cmdebug) does not correctly initialize its output. This leaks kernel memory over the wire: struct AFSDBLock - up to 14 bytes for member name (16 - '\0') Initialize the buffer. Change-Id: I4c5c8d67816c51645c0db44dc8f19b1b27c02757 commit 9d1aeb5d761581a35bef2042e9116b96e9ae3bf5 Author: Mark Vitale Date: Tue Jun 26 03:37:37 2018 -0400 OPENAFS-SA-2018-002 ptserver: prevent PR_ListEntries information leak PR_ListEntries (pts listentries) does not properly initialize its output buffers. This leaks ptserver memory over the wire: struct prlistentries - up to 62 bytes for each entry name (PR_MAXNAMELEN 64 - 'a\0') Initialize the buffer, and remove the now redundant memset for the reserved fields. Change-Id: I29d70c7e4dd567b8b046037f29f71911b8a0593f commit 26924fd508b21bb6145e77dc31b6cd0923193b72 Author: Mark Vitale Date: Tue Jun 26 03:00:02 2018 -0400 OPENAFS-SA-2018-002 volser: prevent AFSVolMonitor information leak AFSVolMonitor (vos status) does not properly initialize its output buffers. This leaks information from volserver memory: struct transDebugInfo - up to 29 bytes in member lastProcName (30-'\0') - 16 bytes in members readNext, tranmitNext, lastSendTime, lastReceiveTime Initialize the buffers. This must be done on a per-buffer basis inside the loop, since realloc is used to expand the storage if needed, and there is not a standard realloc API to zero the newly allocated storage. 
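The realloc concern can be sketched in C under simplified assumptions: because realloc leaves the added storage uninitialized, the caller zeroes the new elements itself. The helper grow_zeroed and struct entry are hypothetical; the commit itself initializes each buffer inside the existing loop rather than introducing a helper like this one.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-transaction record with a fixed-size name field. */
struct entry {
    char name[30];
    int value;
};

/*
 * Grow *arr from old_n to new_n elements and zero only the newly added
 * tail. Existing elements are preserved. Returns 0 on success, -1 on
 * invalid sizes or allocation failure (in which case *arr is unchanged).
 */
int
grow_zeroed(struct entry **arr, size_t old_n, size_t new_n)
{
    struct entry *tmp;

    if (new_n == 0 || new_n < old_n || new_n > SIZE_MAX / sizeof(**arr))
        return -1;
    tmp = realloc(*arr, new_n * sizeof(**arr));
    if (tmp == NULL)
        return -1;
    /* realloc does not initialize the extension; do it explicitly. */
    memset(tmp + old_n, 0, (new_n - old_n) * sizeof(*tmp));
    *arr = tmp;
    return 0;
}
```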
[kaduk@mit.edu: update commit message] Change-Id: I79091fc63435ed2a795955f95bb867bc625ad398 commit 76e62c1de868c2b2e3cc56a35474e15dc4cc1551 Author: Mark Vitale Date: Tue Jun 26 02:33:05 2018 -0400 OPENAFS-SA-2018-002 volser: prevent AFSVolPartitionInfo(64) information leak AFSVolPartitionInfo and AFSVolPartitionInfo64 (vos partinfo) do not properly initialize their reply buffers. This leaks the contents of volserver memory over the wire: AFSVolPartitionInfo (struct diskPartition) - up to 24 bytes in member name (32-'/vicepa\0')) - up to 12 bytes in member devName (32-'/vicepa/Lock/vicepa\0')) AFSVolPartitionInfo64 (struct diskPartition64) - up to 248 bytes in member name (256-'/vicepa\0')) - up to 236 bytes in member devName (256-'/vicepa/Lock/vicepa\0') Initialize the output buffers. [kaduk@mit.edu: move memset to top-level function scope of RPC handlers] Change-Id: If64c02f36f10f52bfbab4b21ad1f60032c223c82 commit 70b0136d552a0077d3fae68f3aebacd985abd522 Author: Mark Vitale Date: Mon Jun 25 18:03:12 2018 -0400 OPENAFS-SA-2018-002 ptserver: prevent PR_IDToName information leak SPR_IDToName does not completely initialize the return array of names, and thus leaks information from ptserver memory: - up to 62 bytes per requested id (PR_MAXNAMELEN 64 - 'a\0') Use calloc to ensure that all memory sent on the wire is initialized, preventing the information leak. [kaduk@mit.edu: switch to calloc; update commit message] Change-Id: Iad623f2cc4c54b79f14a64b8714ba12579d05447 commit 03e804b629c17ca7a4e5789cf98b283c52bd59ed Author: Ben Kaduk Date: Thu Jan 10 11:57:00 2013 -0500 Configure glue for rxgk Add an --enable-rxgk switch to control whether the feature is used. For the sake of buildbot coverage, we still attempt to build the core subdirectory provided that a sufficiently usable GSS-API library is available, but do not install anything when rxgk is disabled at configure time. 
Future commits will use the configure argument to control the behavior of other rxgk-aware code in the tree. We provide a few new symbols to conditionally compile code for rxgk. The two new high-level symbols are: - AFS_RXGK_ENV: when defined, rxgk is available - AFS_RXGK_GSS_ENV: when defined, we can use GSS-API calls AFS_RXGK_GSS_ENV is turned on only for userspace pthread builds. For now, AFS_RXGK_ENV is only turned on for userspace pthread builds, and non-ukernel kernel builds. This effectively disables rxgk integration in any ukernel or LWP code, but this can be changed in the future by changing when AFS_RXGK_ENV is defined. Change-Id: Iab661d47aac77c1a238e809362015b869752df18 Reviewed-on: https://gerrit.openafs.org/10564 Reviewed-by: Benjamin Kaduk Reviewed-by: Andrew Deason Tested-by: BuildBot commit de43a0f8829e26b2c56347176d7938810a38469c Author: Michael Meffie Date: Thu Apr 12 23:18:55 2018 -0400 Suppress statement not reached warnings after noreturn functions Use the AFS_UNREACHED macro to suppress statement not reached warnings while building under Solaris Studio. These warnings are emitted for statements following functions declared with the noreturn function attribute. Change-Id: Ic18cbb3ea78124acbe69edc0eccb2473b46648fe Reviewed-on: https://gerrit.openafs.org/13010 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit b7e9a9b28aaa024b6d6efc6ca74edc690500fc0d Author: Michael Meffie Date: Tue Apr 10 18:29:44 2018 -0400 lwp: add missing lwp prototypes for solaris Add missing lwp function prototypes for Solaris. 
This fixes the compile time warning messages: warning: implicit function declaration: LWP_NoYieldSignal Change-Id: I69c3660bb2631215cd296c08729c8e84d60660fd Reviewed-on: https://gerrit.openafs.org/13008 Reviewed-by: Benjamin Kaduk Reviewed-by: Michael Meffie Tested-by: BuildBot commit ae4ad509d35aab73936a1999410bd80bcd711393 Author: Michael Meffie Date: Fri Jan 19 03:30:22 2018 -0500 rx: fix rx_atomic warnings under Solaris The Solaris implementation of the rx_atomic functions generates numerous compile time warnings due to an integer type mismatch. "rx_atomic.h", line xxx: warning: argument #1 is incompatible with prototype: The rx_atomic_t is an unsigned int under Solaris, however the Solaris atomic_set_long_excl and atomic_clear_long_excl functions take a ulong_t type. Solaris does not provide 'unsigned int' variants of these two functions. Fortunately, ulong_t variants of all the atomic functions we need for rx are available, in current as well as older versions of Solaris, so convert the Solaris rx_atomic_t type to be a ulong_t and convert all of the Solaris atomic calls to the ulong_t variants to avoid integer type mismatches. Change-Id: Ib54ca4bb8b9f044684301f0fb7971aec223e5993 Reviewed-on: https://gerrit.openafs.org/12991 Reviewed-by: Andrew Deason Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk Reviewed-by: Michael Meffie Tested-by: BuildBot commit 3915911bcea2ede55799a15cec614e8291952e1f Author: Michael Meffie Date: Thu Aug 9 16:24:41 2018 -0400 afs: declare nfs translator dispatch functions static Declare the nfs translator dispatch functions to be static to enforce that they are not to be called from outside of the translator.
Change-Id: I1c3d8917c080409424e21e377405472094941da0 Reviewed-on: https://gerrit.openafs.org/13277 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit f8672d0c0f6f58e773ce0e6e4b2fc7b19a5e7ffe Author: Michael Meffie Date: Thu Mar 29 23:36:21 2018 -0400 afs: use void * for generic pointers in the nfs translator dispatcher Replace the use of char * and char ** with void * for representing generic pointers in the nfs dispatcher functions. This was done to fix a large number of compile time warnings, and allows us to remove a number of explicit casts. Also, remove the unnecessary char * casts of memset and memcpy arguments in the nfs translator dispatcher. This commit fixes a large number of Solaris Studio warning messages in the form: ... warning: argument #X is incompatible with prototype: Change-Id: I42e2d40b8112ada9417724282c0230f48a40324f Reviewed-on: https://gerrit.openafs.org/12989 Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit dc2141bf56b43a9531335f581767d7766895b8d2 Author: Michael Meffie Date: Thu Mar 29 23:32:40 2018 -0400 afs: change afs_nfs{2,3}_dispatcher signature The fourth argument of the afs_nfs{2,3}_dispatcher functions is a pointer to a pointer to a exportinfo structure. However, this argument is not an output argument, so the extra level of indirection is unnecessary. A separate local variable is used as an output argument to the afs_nfsclient_reqhandler call within the dispatchers, which is not passed back to the afs_nfs{2,3}_dispatcher caller. In anticipation of other changes to fix warning messages, simplify the signature of the afs_nfs{2,3}_dispatcher functions to avoid taking the address of the exportinfo structure when calling afs_nfs{2,3}_dispatcher. 
Change-Id: I6fb1a190e6aab286bfac41df783688a0be46a21f Reviewed-on: https://gerrit.openafs.org/12988 Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit e7678fb5fb6725055b576b86f6ef994594f0bb92 Author: Michael Meffie Date: Thu Mar 29 23:15:47 2018 -0400 afs: fix missing afs_nfs3_dispatcher return value Fix a missing early return value in the function afs_nfs3_dispatcher. All callers check the return code of afs_nfs3_dispatcher and interpret values greater than 1 to be errors. Return 3 as an error code for this code path, which is the next available error code in afs_nfs3_dispatcher. This commit fixes the following Solaris Studio warning message: ... warning: function expects to return value: afs_nfs3_dispatcher Change-Id: I47b545bd57a46c03006b9f031da3647c8a530377 Reviewed-on: https://gerrit.openafs.org/12987 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Reviewed-by: Michael Meffie Tested-by: BuildBot commit 388eaec3452ed4b18a95ee34efcbe4cf64814701 Author: Michael Meffie Date: Thu Mar 15 18:53:59 2018 -0400 roken: do not clobber __attribute__ The roken-common.h header defines an empty macro called __attribute__ when HAVE___ATTRIBUTE__ is not defined. This macro conditionally removes the `format' function attributes in the roken headers at compile time. Unfortunately, the empty __attribute__ macro will also clobber other attribute types encountered after the roken.h header inclusion. This is not an issue when building under gcc or clang, since the empty attribute macro will not be defined. However Solaris Studio supports a subset of the function attribute types, with `format' not currently supported. This means roken will define an empty __attribute__ macro, which prevents the use of other attribute types. This commit does not change the roken files directly because they are external.
Instead, the processing of the roken.h.in file has been updated to undefine the __attribute__ macro at the end of the generated roken.h header. Change-Id: Iea5622ae175e7f82a60780838948178bd7f8b56f Reviewed-on: https://gerrit.openafs.org/12961 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 1711917e7ded7ebebae74d7bfeb8359a69db8869 Author: Andrew Deason Date: Fri Jun 29 14:48:58 2018 -0500 autoconf: Split out krb5/gss tests Move our krb5 and GSS-related autoconf tests into their own separate files, in src/cf/krb5.m4 and src/cf/gss.m4. Change-Id: I4202df5d810f2d3942fc4ffb3fd406869f68029b Reviewed-on: https://gerrit.openafs.org/13237 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 9d3ef9337fafe5dcf3865d3aced290be0f887c11 Author: Marcio Barbosa Date: Thu May 31 09:46:56 2018 -0300 autoconf: do not reference the missing script Currently, OpenAFS does not use automake. As a result, the missing script is not copied to the build-tools directory. Since this script is not present in the tree, am_missing_run is not initialized. Unfortunately, the current version still has a few references to this variable. In order to preserve a similar behavior, this commit replaces these references by AC_ERROR. While we are changing these, remove the AC_CHECK_PROGS calls for AR and STRIP, since libtool already checks these for us. Change-Id: I833dc6e8611dc7227db4ec77b0160dfa47b7e531 Reviewed-on: https://gerrit.openafs.org/12982 Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit a9644daa965fbf316943a07ad985b8ead2f4f31d Author: Peter Foley Date: Mon Feb 29 16:39:14 2016 -0500 Remove obsolete retsigtype Only relevant for pre-c89 K&R compilers.
[mmeffie@sinenomine.net: avoid changes to src/external] Change-Id: I1b3bf14ddd50f1a6b3d50e0376abffffdb64fb81 Reviewed-on: https://gerrit.openafs.org/12203 Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 451602a5e3a503d46eaecb3738d259e46023afcd Author: Michael Meffie Date: Sat May 26 19:52:27 2018 -0400 autoconf: reformat long lines The autoupdate tool was run to modernize the autoconf macros but generates very long lines. Manually reformat the long lines to make them more reasonable. Change-Id: I6f08138aa7134d8110da885ea4375cebbe903575 Reviewed-on: https://gerrit.openafs.org/13125 Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit 2e23fceec872795a39b915b73e48eb77a5d65afe Author: Peter Foley Date: Mon Feb 29 13:28:28 2016 -0500 autoconf: autoupdate macros Run autoupdate on macros. [mmeffie@sinenomine.net: re-run autoupdate, no other edits] Change-Id: I8b45edea97cf2e065f23f02d2d7f6a0e7adcb8a5 Reviewed-on: https://gerrit.openafs.org/12202 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit f9c584a794c6a4c5d03fa1ee7f1b2b5e1309e7ee Author: Michael Meffie Date: Fri Apr 20 11:47:57 2018 -0400 autoconf: update curses.m4 Replace the obsolete AC_TRY_COMPILE with AC_COMPILE_IFELSE/AC_LANG_PROGRAM in the curses check for the getmaxyx macro. This change was done manually instead of using autoupdate because the program prologue argument for this particular check is an m4 macro, which will not expand to code when autoupdate adds m4 quotes to the AC_LANG_PROGRAM arguments. 
Change-Id: I85b65fb9b59b45d31286436a9f15110cec31bec8 Reviewed-on: https://gerrit.openafs.org/13021 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason commit c5def62d7be4891f534b753374acbf5b524701eb Author: Michael Meffie Date: Mon Apr 16 10:42:49 2018 -0400 autoconf: update pthread checks Replace obsolete AC_TRY_COMPILE with AC_COMPILE_IFELSE. Replace shell if/then conditionals with AS_IF macros. Reformat indentation and quoting. This change was done manually, since autoupdate copes poorly with the old, nested AC_TRY_COMPILE macros. Change-Id: I2c34d1426f154daff65999076821f49ddaa16a24 Reviewed-on: https://gerrit.openafs.org/13018 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa commit 4706854f57043c8393baa922dd1974176e110a19 Author: Peter Foley Date: Mon Feb 29 13:19:01 2016 -0500 autoconf: updates and cleanup Update autoconf macros to their modern equivalents, according to what the 'autoupdate' tool does. While we're here, remove automake references that aren't being used, and remove the obsolete AC_PROG_LIBTOOL in favor of AFS_LT_INIT. Change-Id: I71066d6d72f8b1d8663e26fec83ae23d7f73f059 Reviewed-on: https://gerrit.openafs.org/12199 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit e01053e04a207bc0a7cf07cc9924e37450540fb4 Author: Michael Meffie Date: Thu Jan 25 18:27:00 2018 -0500 SOLARIS: suppress -xarch=amd64 is deprecated warnings The -m64 flag to specify 64bit builds was introduced in Sun Studio 10, circa 2005. The old flag -xarch=amd64 was deprecated as of Sun Studio 12, circa 2007. Ever since Sun Studio 12, the compiler complains with a warning message when the old -xarch=amd64 flag is given: cc: Warning: -xarch=amd64 is deprecated, use -m64 to create 64-bit programs Update the cflags when building the Solaris kernel module for x86 to use the modern -m64 under Solaris 11 or later. 
Since Solaris 11 has been available since 2010, it is very unlikely a compiler on Solaris 11 would not support the modern -m64 flag. Change-Id: Ib13c00f1c69f34ab1905a8dd4a46c90895046f25 Reviewed-on: https://gerrit.openafs.org/12959 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit cc1724e6f5a8f485197aba6246c909869e58d0b2 Author: Perry Ruiter Date: Thu Apr 23 21:33:27 2015 -0700 afsd: Improve syscall tracing When afsd is started with the -debug flag, extensive debug output is generated including tracing for each syscall. Unfortunately the existing syscall tracing is not especially helpful. It dumps out two constants that we already knew at compile time, the first parameter of the syscall along with the syscall's return code. Specifically it does not tell you which syscall is currently being traced. Here's a current example of afsd -debug: afsd: cacheFiles autotuned to 581250 afsd: dCacheSize autotuned to 10000 afsd: cacheStatEntries autotuned to 15000 SScall(183, 28, 6860800)=0 SScall(183, 28, -847416368)=0 SScall(183, 28, 1)=0 afsd: Forking rx listener daemon. afsd: Forking rx callback listener. afsd: Forking rxevent daemon. SScall(183, 28, 0)=0 SScall(183, 28, 1)=0 ... This patch drops the compile time constants (183 and 28 in the above sample output) and replaces them with the name of the syscall being traced. Additionally the first parameter to a syscall is as likely to be an address as a decimal value so display it in hex. Here's an example of afsd -debug with these changes: afsd: cacheFiles autotuned to 581250 afsd: dCacheSize autotuned to 10000 afsd: cacheStatEntries autotuned to 15000 os_syscall(AFSOP_SET_THISCELL, 0x68bf80)=0 os_syscall(AFSOP_SEED_ENTROPY, 0x7fff9ce40c10)=0 os_syscall(AFSOP_ADVISEADDR, 0x1)=0 afsd: Forking rx listener daemon. afsd: Forking rx callback listener. afsd: Forking rxevent daemon. os_syscall(AFSOP_RXEVENT_DAEMON, 0x0)=0 os_syscall(AFSOP_BASIC_INIT, 0x1)=0 ... 
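The new trace format can be sketched as a small lookup from opcode to name plus a hex-formatted argument. This is a minimal illustration, not the afsd code: the numeric opcode values, the table op_names, and the functions op_to_name and format_trace are hypothetical, and only the AFSOP_* names and the output shape come from the sample above. The table uses plain positional initializers rather than C99 designated initializers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical opcode numbers; the real values are not given in this log. */
struct op_name {
    int op;
    const char *name;
};

static const struct op_name op_names[] = {
    { 1, "AFSOP_SET_THISCELL" },
    { 2, "AFSOP_SEED_ENTROPY" },
    { 3, "AFSOP_ADVISEADDR" }
};

/* Map an opcode to its symbolic name, or "UNKNOWN" if it is not listed. */
const char *
op_to_name(int op)
{
    size_t i;

    for (i = 0; i < sizeof(op_names) / sizeof(op_names[0]); i++) {
        if (op_names[i].op == op)
            return op_names[i].name;
    }
    return "UNKNOWN";
}

/*
 * Format one trace line: the syscall name, the first parameter in hex
 * (it is as likely to be an address as a count), and the return code.
 */
int
format_trace(char *buf, size_t len, int op, unsigned long arg, int rc)
{
    return snprintf(buf, len, "os_syscall(%s, 0x%lx)=%d",
                    op_to_name(op), arg, rc);
}
```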
[mmeffie@sinenomine.net: avoid c99 array initialization.] Change-Id: I4f3d46d420d19abeddbf719efa04aef7e553d51f Reviewed-on: https://gerrit.openafs.org/11858 Tested-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 1f29c9f05f53966df1bbd9ece479155f78f995e0 Author: Michael Meffie Date: Fri Mar 16 20:51:42 2018 -0400 autoconf: attribute type checks Check for function attributes by type and update src/afs/stds.h to conditionally include the attributes detected, instead of checking for specific compilers and compiler versions. This allows attributes to be used when building under Solaris Studio. Change-Id: I8a4dbc1b2cb6032d28176349481085bf6deb309c Reviewed-on: https://gerrit.openafs.org/12963 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 8ecc4976b83a034263b348d1b001dda378b26932 Author: Michael Meffie Date: Thu Aug 9 15:18:50 2018 -0400 opr: avoid empty nonnull argument index lists Commit 71dc077831d339fc5822f2c2c79b65afe14b12f8 changed the AFS_NONULL macro in opr.h to fix a build error on windows by adding an empty argument index list. However, Solaris compilers do not support empty parameter lists. Specify the argument index to allow so nonnull function attributes can be supported on Solaris. Change-Id: I3e629868374eb6484923c253da2cdd1d8eacdb2f Reviewed-on: https://gerrit.openafs.org/13276 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit f9b3cf888304d42c2a1a8472fdeeab68a7347859 Author: Michael Meffie Date: Sun Jan 14 09:38:26 2018 -0500 autoconf: check for format __attribute__ to avoid warnings Building with Solaris Studio generates a ludicrous number of warnings in the form: roken.h, line ...: warning: attribute "format" is unknown, ignored Modern Solaris Studio supports several GCC-style function attributes, including the `noreturn' attribute, however does not support the `format' attribute. Currently, configure defines HAVE___ATTRIBUTE__ when the `noreturn' attribute is available. 
roken headers conditionally declare printf-like functions with the `format' function attribute when HAVE___ATTRIBUTE__ is defined, leading to the warning messages when building under Solaris Studio. Unsupported function attributes generate warnings, not errors. Fix these warnings by defining HAVE___ATTRIBUTE__ if and only if the `format' attribute is supported by the compiler, instead of checking for `noreturn'. Note that the `format' type is currently the only attribute used by roken at this time. Change-Id: I569167333d65df2583befc19befa8d719b93d75a Reviewed-on: https://gerrit.openafs.org/12956 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit b818854f19e33315d1b6453b72a55b54d740e976 Author: Michael Meffie Date: Fri Mar 16 20:41:35 2018 -0400 autoconf: import gcc function attribute check macro Import Gabriele Svelto's AC_GCC_FUNC_ATTRIBUTE autoconf macro to check for GCC-style function attributes. This macro is part of the GNU Autoconf Archive[1]. The imported file is distributed under an all-permissive license. [1] https://www.gnu.org/software/autoconf-archive/ Change-Id: I64ccd00717fa9606a26aeeeea9030f4fb4877cf8 Reviewed-on: https://gerrit.openafs.org/12962 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 0da5ac4d9fb2a9b46c7415403a3cd26e711554e2 Author: Andrew Deason Date: Tue Aug 7 17:08:26 2018 -0500 afs: Return memcache allocation errors During cache initialization, we can fail to allocate our dcache entries for memcache. Currently when this happens, we just log a message and try to disable dcache access. However, this results in at least one code path that causes a panic anyway during startup, since afs_CacheTruncateDaemon will try to trim the cache, and afs_GetDownD will call afs_MemGetDSlot, and we cannot find the given dslot. To avoid this, change our cache initialization to return an error, instead of trying to continue without a functional dcache. 
This causes afs_dcacheInit to return an error in this case, and by extension afs_CacheInit and the AFSOP_CACHEINIT syscall. Also change afsd to actually detect errors from AFSOP_CACHEINIT, and to bail out when it does. Thanks to gsgatlin@ncsu.edu for reporting the relevant panic. Change-Id: Ic89ff9638201faae6c4399a2344d4da3e251d537 Reviewed-on: https://gerrit.openafs.org/13273 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 0bc5c15029cf7e720731f1415fcf9dc972d57ef4 Author: Joe Gorse Date: Mon Jul 2 20:36:04 2018 +0000 LINUX: Update to Linux struct iattr->ia_ctime to timespec64 with 4.18 With 4.18+ Linux kernels we see a transition to 64-bit time stamps by default. current_kernel_time() returns the 32-bit struct timespec. current_kernel_time64() returns the 64-bit struct timespec64. struct iattr->ia_ctime expects struct timespec64 as of 4.18+. Timestamps limited to 31 bits roll over after 2147483647, or January 19, 2038 03:14:07 UTC. This is the same approach taken by the Linux developers for converting between timespec64 and timespec. Change-Id: Icc1cf5d1a6679f5c749f8720f225a9b293f675fd Reviewed-on: https://gerrit.openafs.org/13241 Reviewed-by: Stephan Wiesand Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit ee66819a0c1a9efa98b76a1c18af6233bda1e233 Author: Andrew Deason Date: Thu Jul 26 17:57:38 2018 -0500 libuafs: Stop clobbering CFLAGS Currently, in the libuafs MakefileProto for every platform, CFLAGS is set to a bunch of flags, ignoring any CFLAGS set by the 'make' command-line provided by the user. Since most of the rest of the tree honors CFLAGS, it is confusing and can cause errors when src/libuafs ignores the user-set CFLAGS. One example of this breaking the build is when building RHEL RPMs for certain sub-architectures of the current machine.
If you try to 'rpmbuild --target=i686' on 32-bit x86 RHEL 5, we will build with -march=i686 in the CFLAGS, which will be used to build most objects and is used in our configure tests. As a result, our configure tests will say that gcc atomic intrinsics are available. But when we go to build libuafs objects, we will not have -march=i686 in our CFLAGS, which causes (on RHEL 5) gcc to default to building for i386, which does not have gcc atomic intrinsics available. This causes build errors like this: libuafs.a(rx.o): In function `rx_atomic_test_and_clear_bit': [...]/BUILD/openafs-1.8.0/src/rx/rx_atomic.h:462: undefined reference to `__sync_fetch_and_and_4' To fix this, change the libuafs MakefileProtos to not set CFLAGS directly; instead, set them in a new variable UAFS_CFLAGS. Makefile.common then pulls those flags into MODULE_CFLAGS, which is used in our *_CCRULE build rules. While we are here, also move the common set of CFLAGS set by each platform's MakefileProto into Makefile.common. Now, each MakefileProto only needs to set CFLAGS that are specific to that platform, which ends up being very few (since most platforms were using the exact same set of CFLAGS). Relevant issue identified and analyzed by mbarbosa@sinenomine.net. Change-Id: I1bd21a6e7669137be3e5edee86227fd37f841d62 Reviewed-on: https://gerrit.openafs.org/13262 Reviewed-by: Marcio Brito Barbosa Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit a85aab9dfe7c2ee9e025bc15d849de2dd0a48913 Author: Marcio Barbosa Date: Thu Jul 26 10:30:35 2018 -0700 redhat: actually remove unused AFS::ukernel man page Commit 278581c24a802834719e0d57f27978321556c9bb (redhat: package libuafs perl bindings) added swig as a build dependency on RHEL 6+/Fedora 15+ to build and package AFS::ukernel perl bindings for libuafs. The man page for AFS::ukernel is generated from the pod files unconditionally, so needs to be removed from the staging directories when AFS::ukernel is not packaged. 
Unfortunately, the full path to the staged AFS::ukernel manpage was not given in that commit, so the rpmbuild will fail on RHEL 5 with the error: RPM build errors: Installed (but unpackaged) file(s) found: /usr/share/man/man3/AFS::ukernel.3.gz Fix this error by specifying the full path to the AFS::ukernel man page to actually remove it when we are not packaging AFS::ukernel files. [mmeffie: updated commit message] Change-Id: If43f083a1014216e2f9a2669bf9e834149a40944 Reviewed-on: https://gerrit.openafs.org/13257 Reviewed-by: Marcio Brito Barbosa Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 9ff5f8f7601cc9761cc6a4ef0e8b7c8c2c8dddb5 Author: Andrew Deason Date: Fri Jul 27 13:36:15 2018 -0500 ubik: Save errno before logging The value of errno can change after a syscall, and ViceLog may issue syscalls (such as write()). So, make sure we save errno here before calling ViceLog(). Issue spotted by kaduk@mit.edu. Change-Id: I0f3308d64cd779bd97c97834ec2b270f5edd7ba6 Reviewed-on: https://gerrit.openafs.org/13263 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 0e1c042615d1aeb919a22568cdd2b2ea42c677ba Author: Mark Vitale Date: Fri May 4 17:32:51 2018 -0400 ubik: improve logging for database synchonizations As an aid for debugging database synchronization issues, ensure that the logging is consistent and unambiguous for both the client and server sides of DISK_GetFile and DISK_SendFile. Add new error messages as required. In addition, rework the "recovery sending version to " message in urecovery_Interact. This message is misleading because the new version database is only sent to a DB server if its version is not up to date. Instead, move this message into the version check block immediately below it. Also reword it for clarity and promote its log level from 5 to 0. Finally, remove the now-superfluous "recovery stating local database" log message. 
Change-Id: If8bbaa1215cab9fd24b157a0ee57759b34e77e9c Reviewed-on: https://gerrit.openafs.org/13079 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit eac22d3e46c72c0e2b82f35c5187d50b6fa136a2 Author: Mark Vitale Date: Fri Mar 17 18:12:23 2017 -0400 ubik: urecovery_AbortAll diagnostic msgs As a troubleshooting aid for developers, add a few counters and a log msg so we know when transactions are being aborted (if any) by urecovery_AbortAll. Change-Id: I528df6d51acd5d10bb2de30f43b8d4415adc7f8a Reviewed-on: https://gerrit.openafs.org/12618 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Michael Meffie commit 8b0e312d043d435f0e55c6dc14f5446ffedc7ce4 Author: Mark Vitale Date: Mon May 8 21:11:27 2017 -0400 ubik: log important messages at default log level Many important ubik messages (e.g., errors, warnings, sync state changes) are logged at log level 5 (-d 5) or higher. Many sites are reluctant to run ubik servers at a logging level higher than the default due to the large number of extremely noisy informational messages at log level 5. Therefore, many important log messages are never seen. Instead, issue critical errors, warnings, and other important messages at log level 0 so that they are always seen, even at the default logging level. In addition, disambiguate the two "I am no longer sync-site" messages by adding a unique reason text to each. Change-Id: I057edf01e2502e39c5135836f1d0081d03559270 Reviewed-on: https://gerrit.openafs.org/12617 Reviewed-by: Benjamin Kaduk Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Michael Meffie commit 483cad0121d848836b4155817b86231ef21be27a Author: Michael Meffie Date: Fri Jul 6 15:22:36 2018 -0400 vldb_check: write mh entry header flags in network order Commit 6b93ad695e53a86dbe9eea13bd0ff651e1d8c9b7 fixed a false error reported when the vldb contained more than one mh extent block.
That fix changed the readMH() function to convert the flags field to host byte order of all the mh blocks, not just the first block, in order to check the value of those flags. Unfortunately, that commit missed converting non-zero blocks back to network byte order in the complementary writeMH() function, which is used to write the data back to disk when vldb_check is run with the -fix option. FIXES 134589 Change-Id: I4cdbd57b3336e78a9eb1e543ee6d09b33f5e6153 Reviewed-on: https://gerrit.openafs.org/13245 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 7523397333c0f8c6a08312434968d84b8ff56306 Author: Andrew Deason Date: Fri Jun 29 15:25:48 2018 -0500 afs: Make afs_osi_Free(NULL) a no-op In userspace, we assume that free(NULL) does nothing, which makes certain cleanup code paths simpler. This may or may not be true for our free() abstractions that can run in the kernel (like afs_osi_Free, rxi_Free, etc), which is confusing. To make the higher-level free() abstractions more consistent, change afs_osi_Free to guarantee that passing a NULL pointer does nothing. Change-Id: If7c7011795f66464eeb578eacfc943475b4d59f8 Reviewed-on: https://gerrit.openafs.org/13236 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit e60766286b7a581dcdd14466884ea7fdcae10918 Author: Stephan Wiesand Date: Mon Jul 2 14:05:47 2018 +0200 redhat: parallel builds Parallel builds can be an order of magnitude faster. Add the _smp_mflags macro to all invocations of make in the rpm spec, to make use of all available cores and SMT threads on the build system. This should also help noticing new dependency issues early. Note the macro can be overridden on the rpmbuild command line. 
Change-Id: Idddf8b867500d1ee73ff51de9d8a173bb4cc8c68 Reviewed-on: https://gerrit.openafs.org/13240 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit ab61bcffefdd0a431a435def193cd9a46e3b8ab6 Author: Stephan Wiesand Date: Mon Jul 2 13:33:20 2018 +0200 redhat: speed up userland-only rpm builds When building with --define "build_modules 0", have configure skip the Linux kernel tests, which are slow and many. Change-Id: Ie318bf4939776c9a3f8594dcdd5be54b446f33dd Reviewed-on: https://gerrit.openafs.org/13239 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit babf419886d687f8359159f35e8b89aff5e166f8 Author: Stephan Wiesand Date: Mon Jul 2 13:28:07 2018 +0200 redhat: package new file include/opr/lock.h Commit 792dd44ac57032a3f2a4743c83c8a0208a08ecec added the installation of include/opr/lock.h, but the rpm spec fails to pick it up, making rpm builds fail. Add the new file to the files list for the -devel package. FIXES 134579 Change-Id: I998f48bd88308d81779dd775b322590eda75d5c8 Reviewed-on: https://gerrit.openafs.org/13238 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 89e80c354c404dedc0e5197f99710db0e5e08767 Author: Andrew Deason Date: Thu Jul 5 17:16:48 2018 -0500 LINUX: Detect NULL page during write_begin In afs_linux_write_begin, we call grab_cache_page_write_begin to get a page to use for writing data when servicing a write into AFS. Under low-memory conditions, this can return NULL if Linux cannot find a free page to use. Currently, we always try to reference the page returned, and so this causes a BUG. To avoid this, check if grab_cache_page_write_begin returns NULL, and just return -ENOMEM, like other callers of grab_cache_page_write_begin do. Linux's fault injection framework is useful for testing code paths like these. 
The following settings made it possible to somewhat-reliably exercise the relevant code path on a test RHEL7 system: # grep ^ /sys/kernel/debug/fail_page_alloc/* /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:Y /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:N /sys/kernel/debug/fail_page_alloc/interval:1 /sys/kernel/debug/fail_page_alloc/min-order:0 /sys/kernel/debug/fail_page_alloc/probability:100 /sys/kernel/debug/fail_page_alloc/space:90 /sys/kernel/debug/fail_page_alloc/task-filter:Y /sys/kernel/debug/fail_page_alloc/times:-1 [...] Change-Id: I00908658ae43aa3c8e12f2a0b956016d4441016c Reviewed-on: https://gerrit.openafs.org/13242 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit b1ad473be01162fe9b3835544a835c4dcf0fcb35 Author: Mark Vitale Date: Sat Jun 30 17:35:09 2018 -0400 rxevent: prevent negative rx_connection refCount rxi_ChallengeEvent is called directly from rxi_ChallengeOn to start the first challenge; subsequent calls to rxi_ChallengeEvent are from the event handler. When called as an event, we must putConnection the reference held by the event. But when called directly for the first time, the event has not been scheduled yet and so has not taken a reference on the connection. For this case, we must not putConnection or the rx_connection refCount will go negative. One reported symptom of this bug is a fileserver crash with: 'Assertion failed! file rx.c, line 1327.' Introduced by commit 304d758983b499dc568d6ca57b6e92df24b69de8 ('Standardize rx_event usage'). Change-Id: I67122ff84ac9b1b6445ad4005e76e5f8482fd7be Reviewed-on: https://gerrit.openafs.org/13228 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 328590dc5669cae3db6c509871b612b0384ea33d Author: Jeffrey Altman Date: Sat Mar 24 01:22:54 2018 -0400 volser: DoVolDelete returning VNOVOL is success When moving, copying or releasing volumes, do not treat a failure to delete a volume because the volume no longer exists as an error. 
The volume clone has flags VTDeleteOnSalvage | VTOutOfService assigned to it which means that the fileserver won't attach the volume and the volume has its deleteMe field assigned the value of DESTROY_ME. Such a volume will be deleted the next time the salvager scans the partition. Once the transaction is complete the volume might be removed. Change-Id: I0bd38906e3836e0c96f3784a8bd9ad63f5b857c6 Reviewed-on: https://gerrit.openafs.org/12976 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 0322dd56b20b2e2fd6eb7f217964174fb5d25cdd Author: Andrew Deason Date: Thu Jun 28 13:08:47 2018 -0500 afs: Change afs_AllocDCache to return error codes Currently, afs_AllocDCache can fail in 2 different situations: - When we are out of dslots on the free/discard lists - When we encounter an i/o error when trying to traverse the dslot lists But afs_AllocDCache cannot indicate to its caller which of these two cases occurred, since all we have to return is a struct dcache (and so we return NULL on any error). Currently, the caller of afs_AllocDCache in afs_GetDCache is determining which of these cases happened by looking at afs_discardDCList and afs_freeDCList, to see if they look empty. This is not great for at least a couple of reasons: - We are examining afs_discardDCList/afs_freeDCList after we drop afs_xdcache (but while still holding GLOCK) - If afs_discardDCList/afs_freeDCList are somehow changed while afs_AllocDCache is running, we may infer the wrong reason why afs_AllocDCache failed. (currently impossible, but this seems fragile) And in general, this check against afs_discardDCList/afs_freeDCList is rather indirect. It may be easier to follow if afs_AllocDCache just directly returned the reason why it failed. So do that, by changing afs_AllocDCache to return an error code, and providing the struct dcache in an output argument. This involves similarly changing several called functions in the same way, to return error codes.
We only define 2 such error codes with this commit: - ENOSPC, when we are out of free/discard dslots - EIO, when we encounter a disk i/o error when trying to examine the dslot list Note that this commit should not change any real logic; we're mostly just changing how errors are returned from these various functions. Change-Id: I07cc3d7befdcc98360889f4a2ba01fdc9de50848 Reviewed-on: https://gerrit.openafs.org/13227 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 4ab70de9641807bd06056f0c1ac79550453b9574 Author: Andrew Deason Date: Thu Jun 28 12:50:52 2018 -0500 afs: Make afs_AllocDCache static Nothing uses afs_AllocDCache outside of afs_dcache.c. Declare the function static, to ensure that nobody else uses it, and to maybe allow for more compiler optimization. Change-Id: I4e4d1e77e20e853fc20b3d5c5289a5f4124de7a4 Reviewed-on: https://gerrit.openafs.org/13226 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit e14ba54095ea44ca2d6e6833280a201186da91f8 Author: Mark Vitale Date: Fri Mar 17 21:42:31 2017 -0400 ubik: log when a server is marked down, and why In order to better manage voting and recovery, each ubik server tracks (in array ubik_servers) which of its fellow quorum members are 'up' or not. However, ubik currently logs only when a server is "back up"; that is, ubik_server->up transitions from 0 to 1. Add new log messages to identify the time and reason when a server is "marked down" (i.e., ubik_server->up transitions from 1 to 0). Also modify two existing messages to have consistent wording with the new "marked down" messages. Also change them to ViceLog (log level 0) so they will always be logged. Change-Id: I29ee93e96cb7b28b943171d1477671c540a10d78 Reviewed-on: https://gerrit.openafs.org/12616 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 0839a3326858f7d7a0042614710dcf7316bb6018 Author: Mark Vitale Date: Thu Jun 14 14:38:54 2018 -0400 afs: remove dead code afs_CheckLocks has been dead code since openafs-ibm-1_0.
No functional change incurred. Change-Id: I9d57cf3bbbddef182fb128f65b04465bfe0fb492 Reviewed-on: https://gerrit.openafs.org/13210 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit b4b50118d889999042e23507df6eab6eb164b38b Author: Mark Vitale Date: Thu Jun 14 14:03:45 2018 -0400 vol: remove dead code PartitionID has been dead code since openafs-ibm-1_0. No functional change incurred. Change-Id: I93da25ef853716db7a0b7f945f8b19a15a055a43 Reviewed-on: https://gerrit.openafs.org/13209 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 9d0b2698ac7ab8bb689f30d819bbef08c05a8bf7 Author: Benjamin Kaduk Date: Fri Jun 15 09:07:04 2018 -0500 Comment out missing comerr functions from afsauthent.def Apparently commit 70c4922980d1596155b4021cd72d6895c2371e23 was overzealous in making Windows match Unix, as these functions are not available in the Windows build. Change-Id: Ia24430e5069cd61c0557a07d1bd2c35a6872db8c Reviewed-on: https://gerrit.openafs.org/13219 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 907e09ff2b7e86005765a594db27e1df194ec204 Author: Benjamin Kaduk Date: Fri Jun 15 08:39:47 2018 -0500 Comment out opr_AssertionFailed from afsrpc.def Apparently the Windows utilities link opr.lib directly, so this caused a "multiply defined symbol" error. Change-Id: I0499f789a493960b99052e00763703698b3f9517 Reviewed-on: https://gerrit.openafs.org/13216 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 94f1c1e2a7125e93ed49de31522be806af28626b Author: Benjamin Kaduk Date: Fri Jun 15 08:16:26 2018 -0500 Comment out (again!) xdr_Capabilities from afsrpc.def This shows up as an "unresolved external" when linking (though apparently this error does not cause a buildbot failure), noticed when viewing a related windows build log. 
Change-Id: I8bd5e344c1b0e12e0c70e0340bacbc6a94984767 Reviewed-on: https://gerrit.openafs.org/13215 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 472d6b1ee2f7de415e0fa0f8be0636f86956b6fc Author: Michael Meffie Date: Thu Jun 14 15:01:18 2018 -0400 ubik: do not assign variables in logging argument lists Several logging statements in ubik contain an assignment statement within the logging function call argument list, which would set a variable as side effect of evaluating the function call arguments. These embedded assignments are problematic since the logging function calls have been replaced by ViceLog macros, which avoid the overhead of a function call depending on logging levels. Remove the embedded assignments within the logging argument lists so the variables are always set regardless of the logging level. Change-Id: Ifc0f32df2d01f9d8105b49e2c56a95758b184449 Reviewed-on: https://gerrit.openafs.org/13211 Tested-by: BuildBot Reviewed-by: Joe Gorse Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit e08b9c8d36da3f37efabfb3f94476108a5985d23 Author: Benjamin Kaduk Date: Thu Jun 14 20:37:46 2018 -0500 Remove the unused opr_AssertFailU() function Change-Id: Idb55adeea508d3376269bce998eb8b1c3e4cbd59 Reviewed-on: https://gerrit.openafs.org/13213 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 691757576fb6d60a34fef2c4bc50ae581b65ad76 Author: Benjamin Kaduk Date: Thu Jun 14 20:35:46 2018 -0500 Un-export opr_AssertFailU It appears to have been created for parity with osi_AssertFailU, but was then never used. It is safe to remove the export line, since this export has never been in a released version of OpenAFS. 
Change-Id: Ia0bdaec891450fe9a3ca10badcaba68bea27c466 Reviewed-on: https://gerrit.openafs.org/13212 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 14da55719c9e9bff1f8e7e02c8a8d47c59fb7b4a Author: Pat Riehecky Date: Wed Jun 6 11:10:25 2018 -0500 mcas: Make sure 'padding' is null-terminated With 'padding' explicitly filled with all spaces string copy operations may result in unexpected values. Padding is extended by 1 and null terminated to avoid unexpected behavior. (via cppcheck) Change-Id: I8a9845ae87002018705ad23c2b089c8ef571b7bc Reviewed-on: https://gerrit.openafs.org/13164 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 6e7db633efad1c88bb300089e3bd4c9feaea5f23 Author: Benjamin Kaduk Date: Thu May 31 19:02:18 2018 -0500 libafsrpc: export more xdr functions Most of the xdr functions in the library text are to support RXAFS and RXAFSCB RPCs, which we explicitly do not expose from libafsrpc. As such, they do not need to be in the export list, but a couple of generic ones probably should be exported. Do so, for both Unix and Windows. Change-Id: I12ddf2427d807f4ee7b07af1e1c498fc119a0f1c Reviewed-on: https://gerrit.openafs.org/13139 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 0b1edd96ac7a952148ec14f8baaf60c8d8bbc04f Author: Benjamin Kaduk Date: Thu May 31 19:00:03 2018 -0500 libafsrpc: export some more rx functions Change-Id: I6aea7eff7a5bc957896a5a7457a945dd0feaec88 Reviewed-on: https://gerrit.openafs.org/13138 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit f01ee714152a0a6247f2f456aa1f0a728d74373c Author: Benjamin Kaduk Date: Thu May 31 18:40:21 2018 -0500 Export missing opr functions from libafsrpc Our assertion macros expand to function calls, and we have assertions included in macros in installed headers, so the public needs to be able to link against them. Export for both Unix and Windows. 
Change-Id: Ibd1da844f274398e9296f00241b1be48bb95e4fe Reviewed-on: https://gerrit.openafs.org/13137 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit c12cfd7331727142cb928e08ec32a708d0cfd1e9 Author: Benjamin Kaduk Date: Sun May 27 22:54:01 2018 -0500 libafsauthent: export additional xdr_ functions Formally, we need to use xdr_free to deallocate storage for RPC output variables, in case the XDR stack uses a different allocator than the standard application allocator. Some types have non-autogenerated wrappers exposed already (e.g., token_FreeSet()), but for a handful of the base ptint types we need to expose the xdr routines in order for a safe way to deallocate their storage to be available. Change-Id: Iaac349cfaa1a07d5908a88e4c230874c6301471a Reviewed-on: https://gerrit.openafs.org/13131 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 12f4fd2901fee8bf27c2cec97efd3d242c6ff025 Author: Andrew Deason Date: Thu Apr 26 12:27:12 2018 -0500 afs: Stop looking for dcaches on Get*DSlot errors In various places in the code, we'll be looking for a dslot, calling afs_GetValidDSlot (or afs_GetUnusedDSlot) in a loop. In a few places, we currently keep looking for the dslot when we get an error back, since afs_GetValidDSlot may return successfully for other slots, and we might find the dslot we're looking for. This behavior was introduced in a few commits, including: - commit 2679af76 (afs: Traverse discard/free dslot list if errors) - commit 00fd34a6 (afs: Handle easy GetValidDSlot errors) - commit 9a558660 (afs: Cope with afs_GetValidDSlot errors) This behavior means that if afs_GetValidDSlot/afs_GetUnusedDSlot returns an error for a particular dcache slot, but other slots are okay, then we may still find the dcache we're looking for. 
However, by far the most common reason that afs_GetValidDSlot/afs_GetUnusedDSlot fails is because our disk cache is completely unusable; it is very rare that only a few slots cannot be used, but others are fine (this would mean that the disk cache was corrupted in oddly specific ways, or there are small isolated errors in the underlying disk). So continuing the dcache search in these situations is not very useful. On Linux, this is most commonly seen by the underlying disk cache i/o calls returning -EINTR, which can happen if a SIGKILL signal is pending for the current process when we try to do the i/o. In this situation, all attempts to read in a dslot from disk will fail; trying other slots or waiting will not improve the situation. Depending on which specific code path encounters an afs_Get*DSlot error, we can then flood the log with "disk cache read error in CacheItems" messages emitted from afs_UFSGetDSlot, since we keep calling afs_Get*DSlot in our loop. The worst offender of this is usually afs_GetDSlotFromList via afs_AllocDCache, since we end up calling afs_GetUnusedDSlot for every single dslot in the free and discard lists. However, our other call sites that are looking for dcaches for a specific file can still generate quite a few of these messages, since we'll end up calling afs_GetValidDSlot for every slot in a dcache hash chain. So to avoid flooding the log in these situations, change most callers of afs_GetValidDSlot and afs_GetUnusedDSlot to stop on the first error, and act like we never found a dcache that we were looking for. This commit also adjusts one caller in afs_ProcessOpCreate, which was not handling errors from afs_GetValidDSlot at all, and changes FlushVolumeData to be able to return error codes. 
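The "stop on the first error" behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions: struct slot, get_slot() and find_in_chain() are hypothetical stand-ins for the dcache structures and for afs_GetValidDSlot, and are not the actual OpenAFS code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a dcache slot; not the real struct dcache. */
struct slot {
    int index;     /* slot number */
    int file_id;   /* which file this slot caches */
    int next;      /* next slot in the hash chain, or -1 */
    int io_error;  /* nonzero simulates a failed disk cache read */
};

/* Hypothetical stand-in for afs_GetValidDSlot: returns 0 and sets *out
 * on success, or EIO on a simulated disk cache read error. */
static int
get_slot(struct slot *table, int index, struct slot **out)
{
    assert(table != NULL && out != NULL);
    *out = NULL;
    if (table[index].io_error)
        return EIO;
    *out = &table[index];
    return 0;
}

/* Walk one hash chain looking for file_id. On the first read error we
 * give up and act as if nothing was found, rather than trying (and
 * logging an error for) every remaining slot. *reads counts attempts. */
static struct slot *
find_in_chain(struct slot *table, int head, int file_id, int *reads)
{
    int index = head;

    while (index != -1) {
        struct slot *s;

        (*reads)++;
        if (get_slot(table, index, &s) != 0)
            return NULL;  /* stop on the first error */
        if (s->file_id == file_id)
            return s;
        index = s->next;
    }
    return NULL;
}
```

With a three-slot chain whose middle slot fails to read, a search for the last slot's file gives up after two read attempts instead of three, so at most one error would be logged per search.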
Change-Id: I3047da690d39c000ef59dfc0ad526ecc5e382104 Reviewed-on: https://gerrit.openafs.org/13034 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit bec329c1c81d96b5933527f7cdb3638f24833087 Author: Andrew Deason Date: Thu Apr 26 12:01:57 2018 -0500 afs: Avoid GetDCache delays on screwy cache Currently, if our afs_AllocDCache call fails in afs_GetDCache, we retry once per second for 5 minutes. The reasoning is that we're out of dcache slots, and so if we wait a little while, maybe something will become freeable and we can continue. However, afs_AllocDCache can also fail if we have plenty of free dslots, but we are unable to successfully call afs_GetUnusedDSlot() on any of them. This can happen if our disk cache is screwed up, and so waiting and retrying will not make things better (but we'll spew a ton of "disk cache read error in CacheItems slot" errors in the log each time, and do so 300 times). So instead, only do our sleep/retry loop if we actually appear to be out of free or discarded dslots. Otherwise, just return an error immediately, since sleeping and retrying will not make anything better. Change-Id: I331913ab882216e3f71cc44da91f7f7d33c34004 Reviewed-on: https://gerrit.openafs.org/13033 Reviewed-by: Mark Vitale Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 0ff2364bd5e68c0a7587f8fbc552bf20b99d7039 Author: Andrew Deason Date: Thu Apr 26 12:02:18 2018 -0500 afs: Avoid GetDCache panic on AllocDCache failure Currently, in afs_GetDCache, if afs_AllocDCache fails, we retry for 5 minutes and then panic. Panicking in this situation is completely unnecessary; afs_GetDCache can fail for a variety of other mundane reasons (such as, if we can't fetch the requested data from the relevant fileserver). It may seem unusual for afs_AllocDCache to fail for over 5 minutes (this is supposed to mean that we're out of dslots, and our attempts to free up dslots have failed).
However, afs_AllocDCache can also fail if we are having issues in accessing the disk cache, and so we may not be out of cache space or dslots at all; we just can't access the cache. In this case, afs_AllocDCache can easily fail forever; waiting longer or trying to free up cache space isn't going to help. So, to avoid panicking in such situations, just make afs_GetDCache return an error. We just need to make sure afs_xdcache is unlocked, and then we can just jump to 'done', like plenty of other codepaths do; no extra cleanup is required. Also since we are removing a panic, add a log message when this situation happens, so EIO errors don't suddenly pop up silently. Change-Id: I9b8dd6c861b8066822c44758566c05abd7dc1660 Reviewed-on: https://gerrit.openafs.org/13032 Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Tested-by: BuildBot commit 35b6d2a6d5a1ca13544a217a35688e9a0f6b6ec6 Author: Andrew Deason Date: Wed Feb 28 18:25:46 2018 -0600 rxgk: Define some protocol constants rxgk_int.xg is missing a few constants mentioned in the respective protocol specs: - The RPC-L definitions for PrAuthName are defined, but no PRAUTHTYPE_* constants for the 'kind' field are defined. Define at least PRAUTHTYPE_GSS, which rxgk uses. - The rxgk spec indicates a size of 20 for the nonces used in rxgk challenge and response packets. Define a constant (RXGK_CHALLENGE_NONCE_LEN) for this value, to make it easier to define similarly-sized structures. - The rxgk-afs spec defines the time value of 0 as a special "never expires" value. Define a constant (RXGK_NEVERDATE) to represent it.
Change-Id: I07e1a1b19d1c887fd3e1a1d0f270d5af7b8581b0 Reviewed-on: https://gerrit.openafs.org/12939 Reviewed-by: Mark Vitale Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 27d7b8fe4603c39362983758fe6a749fa5ffa4e5 Author: Mark Vitale Date: Fri May 4 15:42:14 2018 -0400 ubik: make ContactQuorum_* routines static Most of the ContactQuorum_* routines are only used in ubik.c, so make them all static - except for ContactQuorum_DISK_SetVersion, which is called from disk.c. Change-Id: I7d1ccd839e01ea8ee8d768dd369a892773361b05 Reviewed-on: https://gerrit.openafs.org/13078 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 8b1e730c11a6ed7dc067ef185302bd57a69f6d1e Author: Mark Vitale Date: Wed May 9 16:50:55 2018 -0400 ubik: remove unused ContactQuorum_DISK_Write This function is not used; remove it. No functional change is incurred by this commit. Change-Id: I7e3bb26fb62b0e28c8703154eb3df384d4dbc32d Reviewed-on: https://gerrit.openafs.org/13077 Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit b9fe4d4290ad19faf3b5fb5dc0c3b1ee3ee5ab69 Author: Mark Vitale Date: Mon May 8 17:50:00 2017 -0400 ubik: disambiguate "Synchonize database with server" msgs Ubik issues the same message in two very different cases: - sync server issues DISK_GetFile to obtain the latest version - non-sync server receives DISK_SendFile from the sync server Modify the messages so they provide more information and are distinguishable from each other. 
Change-Id: I99e8adc7229260f478a0df15791216e090d2e113 Reviewed-on: https://gerrit.openafs.org/12615 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit fdc8adbf0904cbbc0590379c5cb702a15273b40c Author: Mark Vitale Date: Tue Jun 5 14:12:20 2018 -0400 xdr: remove dead code, whitespace from xdr_enum The 'enum sizecheck' declaration has been unused since openafs-ibm-1_0; it is apparently vestigial from the original XDR code. Remove it, along with some extraneous whitespace. No functional change is incurred by this commit. Change-Id: I9f725ab6aff6cafa911975e9edaed8f07c8a328a Reviewed-on: https://gerrit.openafs.org/13076 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit eb1d2ef203a2a99c908b3b89d9ea8337a91b944b Author: Mark Vitale Date: Wed Jun 6 15:23:26 2018 -0400 xdr: avoid xdr_enum memory overrun Since openafs-ibm-1_0, xdr_enum has used xdr_long to read and write, even though enum_t is defined as int. For systems where sizeof(int) == sizeof(long), this works by accident. But on other systems (e.g., DARWIN ARCHFLAGS=x86_64), xdr_enum will overrun its int-sized second parameter. For XDR_DECODE, this results in memory corruption. This was first noticed with OpenAFS 1.8.0 on macOS 10.13; if aklog is issued while already holding a token, it will fail in token_SetsEquivalent with a segfault in decodeToken. The root cause is that the address passed to decodeToken had been overwritten by a previous call to tokenType -> xdr_enum -> xdr_long. Instead, modify xdr_enum to use xdr_int for its work.
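To illustrate why decoding an enum through an int-sized pointer is the safe choice, here is a minimal sketch. It does not use the real XDR stream API: decode_int() and decode_enum() are hypothetical helpers that read a 4-byte big-endian value from a plain buffer, and only enum_t being an int comes from the commit message above.

```c
#include <assert.h>
#include <stdint.h>

typedef int enum_t;  /* per the commit message, enum_t is defined as int */

/* Hypothetical helper: decode a 4-byte big-endian (XDR wire order)
 * integer and store it into an int-sized object. */
static int
decode_int(const unsigned char *buf, int *ip)
{
    uint32_t v = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                 ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];

    *ip = (int)(int32_t)v;
    return 1;
}

/* Decode an enum by writing exactly sizeof(int) bytes. The problematic
 * pattern would instead store through a (long *) cast of ep, which
 * writes sizeof(long) bytes; where long is 8 bytes, that clobbers
 * whatever happens to follow the enum in memory. */
static int
decode_enum(const unsigned char *buf, enum_t *ep)
{
    assert(sizeof(enum_t) == sizeof(int));
    return decode_int(buf, ep);
}
```

When the enum sits next to another field in a struct, decoding through decode_enum() leaves the neighboring field untouched, which is the property the long-based store could violate.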
Change-Id: I671d55588d88e0640f365624b83bd04b53dc97cc Reviewed-on: https://gerrit.openafs.org/13075 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit ef6a1e8118a25b885889179739a3539a598068bc Author: Benjamin Kaduk Date: Sun May 27 16:23:16 2018 -0500 libafsauthent: export ugen_ClientInit* Windows was only exporting the bare version and not the Cell/Flags/Server versions; Unix was exporting none of them. These routines for obtaining a ubik client are more generic than the historical (and already exported) ubik_ClientInit routine, allowing for the use of an alternative configuration directory, additional flags, and the like. Change-Id: I6577ef5f95d2b801c049befa9fddd3b605ff80f5 Reviewed-on: https://gerrit.openafs.org/13130 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 1974eac772157651594c1b76ea8f55e4567b3ec5 Author: Benjamin Kaduk Date: Sun May 27 16:03:12 2018 -0500 libafsauthent: Export more token-manipulation functions For both Windows and Unix. Change-Id: Icd90a2fd3f674b13dd44323d9bc20a8f1070a16e Reviewed-on: https://gerrit.openafs.org/13129 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 4008f83ca80c5ed7b612a13f760b4bb8b9866f2b Author: Benjamin Kaduk Date: Sun May 27 15:18:12 2018 -0500 libafsauthent: export ktc token 'Ex' routines for Unix We need these to handle the modern identity structures (they are already exported on Windows). Change-Id: I3a3f766e9c9a9fad96f2656c4f066a67cacee4a6 Reviewed-on: https://gerrit.openafs.org/13128 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit cdd1f16f5ef52093a8f7d3f87a45775d3c87b780 Author: Benjamin Kaduk Date: Sun May 27 14:18:07 2018 -0500 libafsauthent: export more afsconf_ functions We have new functions for (among other things) typed keys, and generic rx identity management; expose them as well as the legacy key- and user- management functions, on both Unix and Windows. 
Change-Id: Id9bc394d631f9c00915520aff763af497ef2035b Reviewed-on: https://gerrit.openafs.org/13127 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit bcce41bd99b4361631b64cf4749d1dcf80df1cd7 Author: Benjamin Kaduk Date: Sun May 27 13:11:05 2018 -0500 Synchronize libafsauthent afsconf_ exports with windows The Windows library was exporting several more afsconf_* symbols than the Unix one; bring them into sync. Change-Id: Ifba074124a0a3cfeed256553d7dbedbebd3c2996 Reviewed-on: https://gerrit.openafs.org/13126 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 1dc9bb4e7362029db073250f23a09f949e1655de Author: Mark Vitale Date: Fri May 25 17:05:28 2018 -0400 afs: fix broken volume callbacks (e.g. vos release) Commit e99bfcfaa3bca3e65f03928718c2c9eb5eff7c8c ('afs: use jenkins hash for dcache, vcache tables') introduced new hashing implementations for the dcache and vcache hash tables. Unfortunately, a typo introduced a bug into the VCHashV hash function; instead of hashing by volume id, it currently hashes by vnode. The most common symptom is that volume callbacks (RXAFSCB_Callback with fid :0:0) fail to find and invalidate all the files for the specified volume. This typically manifests as persistent stale RO content after a 'vos release' for new RW content. This bug only affects the Unix cache manager; the Windows cache manager implementation of RXAFSCB_Callback was unaffected. Change-Id: I7edca660671b880a69f0c499d54adffbbe62d2b2 Reviewed-on: https://gerrit.openafs.org/13090 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit e71985bce593e9dba43443e084eb726fcc5259e3 Author: Pat Riehecky Date: Fri May 25 12:03:35 2018 -0500 Remove pointless assignments scan-build identified these variable assignments as being unused or redundant.
Change-Id: I3b51e3e1503c0724a2cf1bab37e1c02f4ae533b2 Reviewed-on: https://gerrit.openafs.org/13086 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 9670937d5f12f1edc7bdcb588133f53ec1af2d6f Author: Pat Riehecky Date: Fri May 25 12:48:15 2018 -0500 Convert extended character set to unicode Change-Id: I9989f16ac670e007827ecfe8e02daf9b36d98d4e Reviewed-on: https://gerrit.openafs.org/13088 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 2b08d687b992f238fa59773ef2ff1710c520f861 Author: Pat Riehecky Date: Fri May 25 12:11:54 2018 -0500 Add missing va_end Per man va_start: Each invocation of va_start() must be matched by a corresponding invocation of va_end() in the same function. Change-Id: I703bb3e633435f9c9a62717333a6027476b6bab8 Reviewed-on: https://gerrit.openafs.org/13087 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit a411366f57dcf39cc17b6d61d8332e520dff57d1 Author: Pat Riehecky Date: Wed May 23 15:50:45 2018 -0500 Add braces to empty conditional blocks GCC 7+ is able to quickly optimize away empty if/else blocks if the braces are provided. While this adds some additional syntax, it should also result in faster optimization, so change our empty blocks after conditionals to use braces. FIXES 134377 Change-Id: I2b5e39fd8a3819e07077c2a4f28a9aa5ac432e1e Reviewed-on: https://gerrit.openafs.org/13081 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 759f29cfdfabed4dc5c1b96a0b2b79a3f83c08e3 Author: Michael Meffie Date: Mon Apr 25 11:19:10 2016 -0400 Windows: define AFS_IHANDLE_PIO_ENV for ihandle pio Support for positional i/o in the ihandle package was added to the windows platform in commit 50b6a116a1c412d0e6d7442d13d6e92c9dbb35ee using native windows functions. That commit also defined HAVE_PIO in the windows version of the afsconfig.h file. Unfortunately, that definition of HAVE_PIO is not limited to the ihandle package. 
Remove the project-wide HAVE_PIO definition from the windows afsconfig.h file and define the new AFS_IHANDLE_PIO_ENV symbol when positional i/o support is available in the ihandle package. Build the fallback ih_pread and ih_pwrite functions (which use lseek) only when positional i/o is not available in the ihandle package for the current platform. Use AFS_IHANDLE_PIO_ENV instead of HAVE_PIO in ih_open() to determine when it is safe to share ihandles among threads. Change-Id: I39b078177bc5a2f1daf8a8f8e6bfb1c76e6dfaf7 Reviewed-on: https://gerrit.openafs.org/12270 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit 343234d221ae8388f55748f5c494a42d5d69bfa0 Author: Michael Meffie Date: Mon Apr 25 11:06:11 2016 -0400 ubik: convert ubik_print to ViceLog Use the server logging macros instead of the utility functions to avoid function call overhead, especially at logging level 25. The server logging macros perform a logging level check in-line to avoid the unnecessary ubik_dprint* calls. Change-Id: Ia86efad6257b764f0922957017fe8326f0de76d3 Reviewed-on: https://gerrit.openafs.org/12619 Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Michael Meffie Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 8225518cd08b810bf3d8c74e27e3d3a753b6b30b Author: Mark Vitale Date: Tue Apr 24 14:41:11 2018 -0400 ptserver: improve PR_GetHostCPS logging The IP address of the host is logged as a signed number. Instead, log it as the unsigned (and hex) representation of the host IP addr.
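The difference between the signed and the unsigned (and hex) rendering described in the ptserver commit above can be sketched as follows. This is only an illustration: the helper names and the exact output format are assumptions, not the ptserver logging code.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helpers (not from the OpenAFS tree): render a 32-bit host
 * address into buf, first the way a signed "%d" format would show it,
 * then as unsigned decimal plus hex.
 */
int format_host_signed(char *buf, size_t len, uint32_t host)
{
    /* Addresses with the top bit set come out negative here. */
    return snprintf(buf, len, "%d", (int)(int32_t)host);
}

int format_host_unsigned(char *buf, size_t len, uint32_t host)
{
    /* The same 32 bits, readable as an address in the hex form. */
    return snprintf(buf, len, "%u (0x%08x)",
                    (unsigned int)host, (unsigned int)host);
}
```

For an address such as 0xc0a80001 (192.168.0.1 with the most significant octet first), the signed form is a negative number that is hard to map back to an address, while the hex form shows the four octets directly.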
Change-Id: Ic8b2b7da852a3dc7e9984b63da70d0403845452e Reviewed-on: https://gerrit.openafs.org/13043 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 849ddd4fde0759e385cf3ed4054fc11c36a62fc3 Author: Benjamin Kaduk Date: Sat May 5 15:59:08 2018 -0500 Export afs_getDirPath from shared libraries Add this function to the export list for libafsauthent on Windows and Unix. Change-Id: Ib6f219e407b75a6052d6e29008977c8545b2aa36 Reviewed-on: https://gerrit.openafs.org/13059 Reviewed-by: Anders Kaseorg Tested-by: Anders Kaseorg Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 554c38473d1465af4c4613209229c274807fffd8 Author: Benjamin Kaduk Date: Sat May 5 15:42:51 2018 -0500 Rename getDirPath to afs_getDirPath in preparation for export The symbol name getDirPath is rather generic and we probably shouldn't squat on it in the application's namespace. In preparation for exporting this functionality from the Unix shared libraries, rename it to afs_getDirPath. Retain a Windows-only wrapper getDirPath that can continue to be exported from libafsauthent on Windows, for ABI compatibility. New consumers should use afs_getDirPath. Change-Id: Ie3f3f7b0662451353834d2e3b5c3dd1131c1935e Reviewed-on: https://gerrit.openafs.org/13058 Tested-by: BuildBot Reviewed-by: Anders Kaseorg Tested-by: Anders Kaseorg Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit b48fe6b57f13bacb368e27389ccd3f9c279822da Author: Benjamin Kaduk Date: Sat May 5 15:35:03 2018 -0500 Remove duplicates from liboafs_util.la.sym Remove the extra copy of things which appeared twice. 
Change-Id: I95542172f28759852a76589d05845869cf7e9c9a Reviewed-on: https://gerrit.openafs.org/13057 Tested-by: BuildBot Reviewed-by: Anders Kaseorg Tested-by: Anders Kaseorg Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 3be1de0e823db7068e27b9c5c30a91673f058e52 Author: Benjamin Kaduk Date: Sat May 5 14:42:31 2018 -0500 Export ubik_PR_ symbols from libafsauthent Also export from liboafs_prot the ones missing from this set. This brings the unix exports in sync with the Windows exports (of ubik_PR_ symbols), and is tested as being sufficient to compile python-afs. Change-Id: I77941aa7fbbcb154c67769fe875474920d86d756 Reviewed-on: https://gerrit.openafs.org/13056 Tested-by: BuildBot Tested-by: Anders Kaseorg Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 70c4922980d1596155b4021cd72d6895c2371e23 Author: Benjamin Kaduk Date: Sat May 5 14:00:27 2018 -0500 Export comerr initialization functions from libafsauthent Add to the libafsauthent export symbol list these comerr initialization functions so that they are usable by consumers. Change-Id: I72c6f9402a46aff6fa2719c0b9e0974c7ff7b57e Reviewed-on: https://gerrit.openafs.org/13055 Tested-by: BuildBot Reviewed-by: Anders Kaseorg Tested-by: Anders Kaseorg Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 792dd44ac57032a3f2a4743c83c8a0208a08ecec Author: Benjamin Kaduk Date: Sat May 5 13:11:00 2018 -0500 opr: install afs/opr.h and opr/lock.h These headers are (transitively) referenced from rx_pthread.h, which is pulled in from rx.h when AFS_PTHREAD_ENV is defined. As such, we are presenting an incomplete public API without this header. 
Change-Id: I8afd1d635534910739ec37d56201a86998962cfa Reviewed-on: https://gerrit.openafs.org/13054 Tested-by: BuildBot Reviewed-by: Anders Kaseorg Tested-by: Anders Kaseorg Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 845c8927ef20e245bb88bc783dc2e581b61fbaba Author: Mark Vitale Date: Fri May 19 16:34:21 2017 -0400 ubik: remove redundant memset from udisk_write When udisk_write is extending the database, DRead will return a null buffer. udisk_write then calls DNew to get a brand new buffer for the extension write, and clears it with memset. However, this is redundant, since DNew has already cleared the new buffer. Remove the redundant memset. No functional change should be incurred by this commit. Change-Id: Ia6768098fb3c67475c8948c874b92b91bf17cdb7 Reviewed-on: https://gerrit.openafs.org/12621 Reviewed-by: Benjamin Kaduk Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Tested-by: BuildBot commit e4c7321560acf0bd34eeee7d46269818d82fdb44 Author: Mark Vitale Date: Wed May 17 16:32:20 2017 -0400 ubik: death to orphaned signals ubik has a few very old "orphaned" LWP events that are signalled via LWP_NoYieldSignal, but have no matching waits (LWP_WaitProcess). Each "signal" runs the LWP waiting element list for each LWP on the blocked queue; this may add up to substantial wasted overhead on a heavily loaded ubik server. Remove the orphaned signals. No functional difference should be incurred by this commit. Change-Id: I66eba45975a829216e7af1927e51ec6aab63f570 Reviewed-on: https://gerrit.openafs.org/12620 Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Reviewed-by: Mark Vitale Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 55013a111394052a0253c87a744d03dfabd1be75 Author: Pat Riehecky Date: Wed May 23 15:42:09 2018 -0500 lwp: Fix possible memory leak from scan-build It is possible for LWP_CreateProcess to return early. 
When it does, it should free up any memory it allocated before leaving scope. Change-Id: Ib5644d36dc01bbac33804f4a039661ce2c78969d Reviewed-on: https://gerrit.openafs.org/13080 Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 850c7c50dccbdebb8e0a44da4fc7840760d9e02d Author: Michael Meffie Date: Fri Apr 27 23:08:34 2018 -0400 util: check for trailing characters in partition names The function which maps partition names to partition ids currently ignores trailing characters in the partition names. For example, the partition name "/vicepbogus" is currently considered a valid partition name ("/vicepbogus" maps to "bo" which is id 66). Although this is not a regression, it is problematic for several reasons. Firstly, this can lead to duplicate partition ids on the server, for example "/vicepbad" and "/vicepbar" both map to the same partition id ("ba" is id 52). Second, partitions are internally tracked by numeric id. The partition names are generated from numeric ids when reporting partition names. This means the trailing characters are lost when reporting the partition names. For example, vos reports the attached partition "/vicepbad" as "/vicepba". Third, it could be possible (but perhaps unlikely) in the future to extend the range of partition ids, so the trailing characters could become significant at that time. Finally, it could be confusing to admins that such partition names are attached by the fileserver. For example, "/vicepaa-backup" is attached and is used by the fileserver as partition id 26. This change adds a check for trailing characters in partition names in the volutil_GetPartitionID function, so it is more strict in what it accepts as a valid partition name. That function will now return -1 (illegal partition name) when trailing characters are found in partition names. 
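The stricter name check described above can be sketched roughly as follows. This is a minimal illustration that reproduces the mappings quoted in the commit message ("aa" is 26, "ba" is 52, "bo" is 66); the function name is hypothetical, it is not the volutil_GetPartitionID implementation, and it deliberately omits any upper bound on the partition id.

```c
#include <string.h>

/*
 * Hypothetical sketch: map "/vicepX" or "/vicepXY" to a numeric id.
 * Returns -1 for a malformed name or for any trailing characters.
 * Assumes 'a'..'z' are contiguous, as in ASCII.
 */
int partition_name_to_id(const char *name)
{
    static const char prefix[] = "/vicep";
    const char *p;

    if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    p = name + sizeof(prefix) - 1;

    if (p[0] < 'a' || p[0] > 'z')
        return -1;
    if (p[1] == '\0')
        return p[0] - 'a';              /* "a".."z" map to 0..25 */

    if (p[1] < 'a' || p[1] > 'z')
        return -1;
    if (p[2] != '\0')
        return -1;                      /* trailing characters: reject */

    /* Two letters: "aa" maps to 26, "ba" to 52, "bo" to 66. */
    return (p[0] - 'a' + 1) * 26 + (p[1] - 'a');
}
```

With this check, "/vicepbad" and "/vicepaa-backup" are rejected instead of being silently treated as "/vicepba" and "/vicepaa".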
Change-Id: Iad9aee05fcf439cac9afcd89cf367be693261fbd Reviewed-on: https://gerrit.openafs.org/13039 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Andrew Deason commit c0f2c26e9298d12209fbb5e523ea3173202316e5 Author: Michael Meffie Date: Fri Apr 27 22:59:57 2018 -0400 vol: check for bad partition names Currently, servers attempt to attach any partition name starting with "/vicep", even partition names which map to out of range partition ids. Examples of such misnamed partitions are "/vicepzz", "/vicep0", and others. The presence of these misnamed partitions cause the server processes to crash on startup, since the out of range partition ids are used as an index. Add a check for the bad partition names in VCheckPartitions to avoid attaching them. Log a warning for such partitions to let the admins know why the partitions are not attached. Change-Id: I553ce6cc8bc751b9ed789312f7efb4e0f737a52e Reviewed-on: https://gerrit.openafs.org/13038 Reviewed-by: Benjamin Kaduk Reviewed-by: Marcio Brito Barbosa Reviewed-by: Andrew Deason Reviewed-by: Mark Vitale Tested-by: Benjamin Kaduk commit f1d389e80367c7ea532441f9aa27a6cc3e2853a7 Author: Andrew Deason Date: Thu May 10 16:23:48 2018 -0500 ubik: Make udisk_Log* functions static Nothing uses the udisk_Log* functions outside of disk.c. Declare these static to make sure they stay that way, to make it easier to change their semantics. Change-Id: I068684782b22af788ce892c995a6d80f2d9fb2e0 Reviewed-on: https://gerrit.openafs.org/13069 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit b8617f08d1bf57a6b3fbba44e5b4de24dc84a9bb Author: Andrew Deason Date: Thu May 10 16:05:10 2018 -0500 ubik: Remove 'mtime' from ubik_stat Nothing uses the 'mtime' field from ubik_stat. Remove it. 
Change-Id: I7611a7ca5aa5743be43aefafeda5ecf9a5d47598 Reviewed-on: https://gerrit.openafs.org/13068 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit f045de21a45fcc8f71e2b30e826c22c8a7b4d0f2 Author: Jeffrey Altman Date: Fri May 11 15:44:24 2018 -0400 viced: SRXAFS_InlineBulkStatus set InterfaceVersion on error AFSFetchStatus.InterfaceVersion is required to be "1" for any of the fields in the structure to be considered valid. Therefore, InterfaceVersion must be set to one when returning an 'errorCode' value. When RXAFS_InlineBulkStatus was introduced by OpenAFS in 362d26c733b086d26f013bd229af979a112098f5 not only wasn't InterfaceVersion set but neither was the memory allocated to OutStats initialized. As a result the InterfaceVersion field value could be not only zero but random. The OutStats memory was initialized to zeros beginning with 726e1e13ff93e2cc1ac21964dc8d906869e64406. Change-Id: I5ca1b08cb32d01843a1c6dee87d8ba1d560396c8 Reviewed-on: https://gerrit.openafs.org/13067 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 3cc22a442e1dad628f0b11a32c4037fc7174dde4 Author: Marcio Barbosa Date: Tue May 15 17:10:45 2018 -0400 ubik: clones should not request votes Clones should not be able to become the sync-site. To make it possible, regular sites do not vote for a site tagged as clone. In other words, the clones ask for votes but they cannot be the sync-site. Knowing that their requests for votes should be refused by the regular sites, they should never have enough votes to win the election. In addition to the unnecessary network traffic created by these unnecessary requests, this current approach can be problematic in some specific situations. As an example, consider the following scenario: The user wants to turn a regular site, called host1, into a clone. To do so, he runs the following commands on every single server: $ bos removehost -server -host host1 $ bos addhost -server -host host1 -clone After that, he restarts the servers, one by one. 
Depending on the delay between the restarts, a clone can become the sync-site. This is possible because the clones request votes from the other sites. If enough regular sites are not aware (yet) that the request for vote came from a clone, the clone in question can get enough votes to win the election. To fix the problems mentioned above, do not request votes if you cannot be the sync-site. Change-Id: Ic3569af8264dfff32f2a86b8dd99b922193f010a Reviewed-on: https://gerrit.openafs.org/12654 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 8e740aed774d4507e656e6ae743f6c6fe6c0e356 Author: Marcio Barbosa Date: Thu May 10 00:46:01 2018 -0300 afs: alloc openafs_lck_grp before osi_Init() on darwin Commit a27bed59cae1a4244429c752edfde0a8363c8a3b moved init_hckernel_init to osi_Init. On Darwin (AFS_DARWIN80_ENV), MUTEX_INIT (called by init_hckernel_init) uses openafs_lck_grp as the argument of one of the functions called during the initialization of the mutex in question. Since openafs_lck_grp was not allocated yet, we crash. To fix this problem, call MUTEX_SETUP() before osi_Init() on Darwin. Change-Id: Ib53118208d3ca7982e712768f334299e3d948805 Reviewed-on: https://gerrit.openafs.org/13065 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit c16423ec4e678e5cb01dc99f4115065f8ef6caf7 Author: Marcio Barbosa Date: Mon May 14 16:46:26 2018 -0300 rx: fix atomics on darwin As described by commit b2a21422129ca1eeeb5ea1a1f7b08b537fd2a9f7, the API used for atomic operations in kernel space is not the same as the one used in user space. To fix this problem, the commit mentioned above introduced macros to correct the name of these functions in kernel space. Unfortunately, the return value of the functions used in kernel space is not the same as the ones used in user space. Generally speaking, the kernel space atomic functions return the original value of the variable received as an argument before the operation in question. 
On the other hand, the user space atomic functions return the new value, after the operation has been performed. To fix this problem, this commit provides a new set of inline functions (only used in kernel space) with the expected return values. Also, in order to get the inline implementations of the OSAtomic interfaces in terms of the primitives, commit 74f837fd943ddfa20d349a83d6286a0183cb4663 defines OSATOMIC_USE_INLINED on OS X 10.12. However, the definition of this macro only affects the user space legacy interfaces for atomic operations. The kernel space interfaces for atomics are not deprecated and OSATOMIC_USE_INLINED does not affect these functions. To fix this problem, only define OSATOMIC_USE_INLINED in user space (OS X 10.12+). Change-Id: Ia6cbc76daa7068625dc9f6dff385d0568d6503bd Reviewed-on: https://gerrit.openafs.org/13063 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 96a4bee20d42484148d163b85ca049dcc980a7a5 Author: Andrew Deason Date: Tue May 8 19:09:42 2018 -0500 LINUX: Remove unused osi_fetchstore.c Ever since commit ae5f411c (Linux 4.4: Do not use splice()), most of osi_fetchstore.c has been '#if 0'd out. The only portion that isn't is a function definition that is unreferenced (afs_linux_read_actor). Remove the unused code, and other '#if 0' references to it; the code can always be added back later when we can actually use it. Change-Id: Ifc062d5665393aa6693eb0db63aa23e4feb44df4 Reviewed-on: https://gerrit.openafs.org/13061 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 46d5695a383b2b993fdd598b770f4e3c0e1a41f3 Author: Andrew Deason Date: Mon Apr 30 17:58:43 2018 -0500 afs: WriteThroughDSlots: Avoid write error panic Currently, afs_WriteThroughDSlots panics if our call to afs_WriteDCache fails. 
Since afs_WriteThroughDSlots is called every minute by a background daemon, this means that if our cache fs becomes inaccessible (by being forced read-only, or for any other reason), we are virtually guaranteed to panic relatively quickly. To try to avoid this at least for some cases, change afs_WriteThroughDSlots to return an error to our caller when we encounter such an error. For our background task, we can just ignore the error and retry the writes on a future iteration. During shutdown, we still panic if we encounter an error, to try to avoid silently allowing a corrupt cache to be used on subsequent boots. Change-Id: Ia5f180a5c709881c3e884629c02e9ff93729fa88 Reviewed-on: https://gerrit.openafs.org/13047 Reviewed-by: Benjamin Kaduk Reviewed-by: Michael Meffie Tested-by: BuildBot commit 22e64df8e043fa7bd78bff263866ee2bd6a6e13d Author: Andrew Deason Date: Mon Apr 30 17:33:14 2018 -0500 afs: Avoid afs_GetDCache panic on cache open error When we need to populate a dcache entry, afs_GetDCache calls afs_CFileOpen to get a handle for our file backing that dcache. Currently, if we cannot open the file, we panic. To handle this a little more gracefully, just return an error from afs_GetDCache instead. The relevant userspace request will probably fail with EIO, but this is better than possibly crashing the whole system. Change-Id: If570ecc7f0fd0aab8340b568fc6cb2e2d316f35a Reviewed-on: https://gerrit.openafs.org/13046 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 3ec0414f769c37a19410fbd9aefb086cb5b69e55 Author: Benjamin Kaduk Date: Tue May 8 18:04:21 2018 -0500 Use afs_DestroyReq in afs_PrefetchNoCache() Since commit 76ad941902c650a4a716168d3cbe68f62aef109f we use afs_DestroyReq() instead of osi_Free() directly. Also update the UKERNEL version of the function to afs_CreateReq() properly. 
FIXES 134533 Change-Id: I4a13f6232dbed12ee00ce219cb5f515529fff58c Reviewed-on: https://gerrit.openafs.org/13060 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit f6af4a155d3636e8f812e40c7169dd8902ae64be Author: Andrew Deason Date: Mon Apr 30 17:30:56 2018 -0500 LINUX: Return NULL for afs_linux_raw_open error Currently, afs_linux_raw_open (and by extension, LINUX's implementation of osi_UFSOpen) panic when they are unable to open the given cache file. To allow callers to handle the error more gracefully, change afs_linux_raw_open and osi_UFSOpen to return NULL on error, instead of panic'ing. Expand the language a little on the message logged while we're here, since the system might keep running after this situation now. This commit also changes all callers that did not already handle afs_linux_raw_open/osi_UFSOpen errors to assert on errors, so we still panic for all situations where we encounter an error. More graceful behavior will be added in future commits; this commit does not change the behavior on its own. An error on opening cache files can legitimately happen when there is corruption in the filesystem backing the disk cache, but possibly the easiest way to generate an error is if the filesystem has been forcibly mounted readonly (which can happen at runtime due to filesystem corruption or various hardware faults). The latter will generate -EROFS (-30) errors, but of course other errors are probably possible. Change-Id: I1462ec43c76c0b07e9368b37a9dbaedf6b6f4409 Reviewed-on: https://gerrit.openafs.org/13045 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 54e84a98f9747bb5bb2ad4b8031115ad7684c914 Author: Benjamin Kaduk Date: Fri Apr 13 08:07:59 2018 -0500 BSD: Work around panic in FlushVCache Commit 64cc7f0ca7a44bb214396c829268a541ab286c69 created the very useful afs_StaleVCache() helper function, but unfortunately it also introduced a subtle change into how we check for whether a vcache may be a directory. 
Previously, we just used the low bit of the Fid's Vnode number, since files have an even number and non-files an odd number. The new version uses that check but also explicitly checks `vType(avc)` against VDIR, and this new check involves consulting information stored in the associated vnode entry, not the vcache directly. The afs_FlushVCache() implementation for XBSD and DARWIN removes the cross-linkage between vcache and vnode, so that AFSTOV(avc) becomes NULL. Just a few lines later, it calls afs_StaleVCacheFlags(), at which point vType() dereferences a bad pointer (offset from a NULL pointer) and panics. This would happen during shutdown, or other periodic reclaim/flush events that can be scheduled. Change-Id: I0800e5c743cedcbec628bfa8c8ea8978c2488c1c Reviewed-on: https://gerrit.openafs.org/13014 Reviewed-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit cfa74883e4996dfee2bd6ffaa3b967e5a7941e0b Author: Stephan Wiesand Date: Thu Apr 26 19:50:06 2018 +0200 redhat: PACKAGE_VERSION macro no longer exists Commit 0d0e7699c9f789214205fe6837cded1a4c95f9c0 replaced all uses of the %PACKAGE_VERSION macro in the spec with the %version one, but missed an instance in the kmodtool script. Fix this, to avoid a warning during rpmbuild. Change-Id: I363241f45c5261aaf2fa0619fb159022f6dbd56a Reviewed-on: https://gerrit.openafs.org/13031 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 076b73e06df8240f209470ea6ee19b66eb4166c3 Author: Stephan Wiesand Date: Thu Apr 26 19:33:31 2018 +0200 redhat: Make separate debuginfo for kmods work with recent rpm Commit 443dd5367e0cd9050ad39a6594c5be521271b4e9 introduced the creation of separate debuginfo packages for kmod packages, and commit 387ae9536888419d7b101513e04e1c644e3218d6 moved the code from the spec into the kmodtool script.
Recent versions of rpm (the issue was found on Fedora 27) extract the debuginfo data from a copy of the original files having the package version-release as a suffix. This broke the original change since the regular expression passed to find-debuginfo.sh no longer matched the name of the openafs.ko file. The file list for the -debuginfo package remained empty, which caused rpmbuild to fail. Relax the regex to match the previous and current file names we are after. It is possible but unlikely that .*openafs\.ko.* will ever match any file not being a kernel module. Change-Id: I57178ed2c593551ede6f4ab2679dd0360dc362cf Reviewed-on: https://gerrit.openafs.org/13030 Tested-by: BuildBot Reviewed-by: Michael Meffie Tested-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 09f31d4c21328bcdc1dccdedf7df53d77c22e3e3 Author: Jeffrey Altman Date: Fri Feb 23 18:47:46 2018 -0500 rx: connection aborts send serial zero when no conn available When no connection object is available, send serial number zero (0) instead of one (1). There is no harm in sending one (1) but it might be confused as the first packet sent on the connection. Multiple connection aborts sent would all be sent with serial one (1). Serial number zero (0) can be an indication to humans reading packet traces that the sender has no knowledge of the connection. Change-Id: I1951284f810170bd130e4f1d8ed93b903cd66659 Reviewed-on: https://gerrit.openafs.org/12932 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit cacf2b646759132dbf21e9c04fb3cfc6c2f8f1f3 Author: Jeffrey Altman Date: Fri Feb 23 18:26:24 2018 -0500 rx: pass serial number to rxi_SendRawAbort The practice of stamping abort packets with the connection's next serial number was altered by a0ae8f514519b73ba7f7653bb78b9fc5b6e228f8. This change restores the prior behavior by passing a serial number as a parameter to rxi_SendRawAbort() so that the serial number can be obtained from the connection instead of hard coded as 1. 
Change-Id: I0fb516b2c596e675fa4bc44598a697de81d36d83 Reviewed-on: https://gerrit.openafs.org/12931 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 3d3e7bc51aaf39b5ca04bfd36ff9017ab0622057 Author: Michael Meffie Date: Mon Apr 9 19:54:54 2018 -0400 autoconf: add kernel module to the summary Add the kernel module to the list of optional build items in the configure summary to indicate whether the kernel module build is enabled. Change-Id: I11d247ac66d8119910a90a0240b0ce5854449db4 Reviewed-on: https://gerrit.openafs.org/13005 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 85e9db22b265f9bb3745246fea3a07158b8a8c0e Author: Michael Meffie Date: Mon Apr 9 19:50:28 2018 -0400 autoconf: remove uss from configure summary Commit 00a33b26d74aa067086ddc340efb82184715857f (uss: always build uss) made the uss build unconditional. Remove it from the list of optional items in the configure summary. Change-Id: Ia249451c574974b4f0892c4d6d626c57404ea8ce Reviewed-on: https://gerrit.openafs.org/13004 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 833a81eeda6e48ea1ced92169434e843d054c44d Author: Michael Meffie Date: Mon Apr 9 16:42:41 2018 -0400 autoconf: remove more linux 2.4 references Remove old linux 2.2 and 2.4 references in the autoconf macros left over from the linux 2.2 and 2.4 days. Change-Id: Ie859d938fa1fee1d98a035b55e5e41120b66bc69 Reviewed-on: https://gerrit.openafs.org/13003 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 28ea20d03f8abd8109547d6825edad159748397a Author: Michael Meffie Date: Thu Apr 5 23:43:34 2018 -0400 redhat: remove the openafs-kernel-version.sh script Commit ec706b21530240d7fb66bad2f08513eff8f7c335 (Remove Linux 2.4 compat from RedHat packaging) removed the use of the script openafs-kernel-version.sh, which was used in the linux 2.4 days to look up the current kernel version. Nowadays, we use the openafs-kmodtool script to determine the kernel version. 
Remove the unused openafs-kernel-version.sh script from the package sources. Change-Id: I6494812004f7b59c786ff670ff37c2fdc354f371 Reviewed-on: https://gerrit.openafs.org/12996 Tested-by: BuildBot Reviewed-by: Stephan Wiesand Reviewed-by: Benjamin Kaduk commit 9f0164f4254da39c3c31e0268da58ce7a6ccda1d Author: Michael Meffie Date: Thu Apr 5 22:56:50 2018 -0400 redhat: remove extra kernel version check Commit a1c072ac562ccf74e5afb8449db1bcef86aef362 (redhat: fix rpmbuild command line option defaults) added logic to set the default value of the kernvers variable when not specified as an rpmbuild command line option. This default value is not necessary, since 'kmodtool verrel' already returns the current running kernel version by default. The result of 'kmodtool verrel' sets the kverrel variable, which holds the value of the kernel version we are building. The kernvers variable is only used as an argument to 'kmodtool verrel' and may be empty by default to indicate the current version should be returned. Remove the unnecessary setting of the default value of kernvers. Also update the information banner to show the value of kverrel, which is the actual version we are building, instead of kernvers, which is empty by default. Change-Id: I45ded3b4f61ec60a64288b89c1d553df9fa7b867 Reviewed-on: https://gerrit.openafs.org/12995 Tested-by: BuildBot Reviewed-by: Stephan Wiesand Reviewed-by: Benjamin Kaduk commit 909d8358109445fdb316b68a8e55e17626cf17c9 Author: Ian Wienand Date: Tue Mar 20 14:01:43 2018 +1100 Remove warning "find_preferred_connection: no connection and !create" find_preferred_connection() is called with !create via afs_ConnByHost->afs_ConnBySA to determine if there is a cached connection available. Don't warn, as it will next be called with the create flag to create the connection anyway.
Change-Id: I02c2150a04ef20c54da793926fb402b946311f9a Reviewed-on: https://gerrit.openafs.org/12964 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 154512831966d12c1e32e6271d4ab1440a25b96e Author: Stephan Wiesand Date: Wed Apr 4 17:09:39 2018 +0200 FBSD: param.h consistency Commit 88dc4d93f5ef080da8f56fac453f095e6c79d4a0 ("Add param.h files for recent FreeBSD") introduced an inconsistency between the i386 and amd64 param.h files for 11.1 and 12.0 regarding the *_FBSD101_ENV #defines. Citing Benjamin Kaduk: "Traditionally we have the param.h for a FreeBSD N.0 release include the (N-1).Y values that existed at the time of the N.0 release, and freeze that set of (N-1).Y values for the lifetime of FreeBSD N.x, if that makes sense." Given that FreeBSD 11.0 was released shortly after 10.3, and 12.0 is not yet released, consistently #define *_FBSD10{1..3}_ENV for 11.1 and *_FBSD10{1..4}_ENV for 12.0 Change-Id: Ibb7e6c4caaab7aa97b32eeec7aa0bbe998bb57f7 Reviewed-on: https://gerrit.openafs.org/12990 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 1a0d68676526a5031d7f06f44d58c6dbb2b65da7 Author: Marcio Barbosa Date: Thu Mar 29 15:52:12 2018 -0300 autoconf: remove check for lorder Currently, lorder is not being used. Remove the conditional that checks if this binary exists. Change-Id: I5ccee8b34f33ba0bda38a1d0478ff7a46f73f79c Reviewed-on: https://gerrit.openafs.org/12981 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 387ae9536888419d7b101513e04e1c644e3218d6 Author: Stephan Wiesand Date: Mon Mar 26 20:21:19 2018 +0200 redhat: Create unique debuginfo packages for kmods Commit 443dd5367e0cd9050ad39a6594c5be521271b4e9 ("redhat: separate debuginfo package for kmod rpm") introduced the creation of separate debuginfo packages for the kmod packages. 
As such, this is useful, but all debuginfo packages for a given OpenAFS release ended up with the same name/version/release for the kmod debuginfo package, no matter which kernel release or variant the kmod was built for. Move the additional black magic from the spec into the kmodtool script where we have the means to do better: Use the same naming and versioning conventions as for the kmod-openafs packages themselves. Change-Id: Ibcb34e4c8efde13d0600005772751d8aeb8154aa Reviewed-on: https://gerrit.openafs.org/12977 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Tested-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 60a006bdc43df42e40eb43f1e1af7fffe3e85763 Author: Ben Kaduk Date: Fri Dec 13 16:25:47 2013 -0500 Export {Get,Set}ServiceSpecific from liboafs_rx.la rxgk will use service-specific data. Change-Id: Id9e2d4b9920e771e1583b9362e61de6216c246b4 Reviewed-on: https://gerrit.openafs.org/10589 Reviewed-by: Daria Phoebe Brashear Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit f70ab59f88aa41074c9f075368137bd663cc8bce Author: Ben Kaduk Date: Mon Dec 9 14:42:13 2013 -0500 Add some time-related helpers RXGK_NOW(), a quick routine to get the current timestamp as an rxgkTime, and secondsToRxgkTime for the more general scaling factor. Change-Id: I0051b5c8e5ad61e35431d97454bf2741daba90cb Reviewed-on: https://gerrit.openafs.org/10566 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Michael Meffie Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit f47cb2d4a957910c3e7d4b755f41ddef5dd103c5 Author: Michael Meffie Date: Sun Jan 21 18:38:11 2018 -0500 Suppress statement not reached warnings under Solaris Studio Solaris Studio issues warnings for statements which can not be reached, such as statements following an infinite loop. 
For example, the return statement will generate a 'statement not reached' warning in the following code: while (1) { /* no breaks or gotos in this body */ } return 0; Suppress these warnings by conditionally removing such statements when building under Solaris Studio. Change-Id: Ib4f465bf9c00eff0d603e5bd643db7d3a5aa0ba0 Reviewed-on: https://gerrit.openafs.org/12958 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 306f0f3100e453e165032ae3bc9022b4a9a9a4c5 Author: Michael Meffie Date: Sat Jan 13 20:14:59 2018 -0500 afs: squash empty declaration warning Remove spurious semi-colon which generates a warning when building under Solaris Studio. "./src/afs/UKERNEL/sysincludes.h", line ...: warning: syntax error: empty declaration Change-Id: I022728ddfd4b8229db0a247de2470846c802a462 Reviewed-on: https://gerrit.openafs.org/12957 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit e0066095e7f74653c2c08d1b00010ba59f4c2cf3 Author: Michael Meffie Date: Sat Jan 20 18:34:18 2018 -0500 libafs: git ignore build artifacts on Solaris Ignore build artifacts generated when building the kernel module for Solaris: src/libafs/inet src/libafs/nfs src/libafs/ufs Change-Id: Ie791c45c48ffc15547864bee568f52f74ab6020f Reviewed-on: https://gerrit.openafs.org/12955 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 348dc87bb2eeb66d1e683dc91ee36724ee18f1af Author: Ben Kaduk Date: Fri Dec 13 16:17:54 2013 -0500 Export a few krb5 routines for rxgk We need oafs_h_krb5_generate_random_block when generating random keys and oafs_h_krb5_crypto_fx_cf2 for CombineTokens. Having oafs_h_krb5_crypto_prf_length proves very convenient for key derivation of transport keys, so move it to the public header and export it. oafs_h_krb5_enctype_keysize is needed so that we can tell whether or not we need to pass through random_to_key() when making rxgk_keys. oafs_h_krb5_random_to_key is needed for that random_to_key() operation. 
Change-Id: Ia34c8028b07df203b3885157e2d46c6bb512f608 Reviewed-on: https://gerrit.openafs.org/10936 Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit fe8a1f3a2b669057451cac358faa7320722dc053 Author: Ben Kaduk Date: Wed Dec 4 13:03:15 2013 -0500 auth: Let superuser identities be superusers We have a special rx_identity_kind for superusers; let it actually be useful for something. Change-Id: I1d551ed8e5fcfd6bdc29c6c27eee4c2ae67e1a89 Reviewed-on: https://gerrit.openafs.org/10575 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 00e12efa29659c28f0fd7b6acbfb57d91a6ca477 Author: Andrew Deason Date: Tue Mar 6 22:04:28 2018 -0600 SOLARIS: Check for map_addr() without 'vacalign' Add a configure check to see if the map_addr() function contains the 'vacalign' argument or not. The argument was removed sometime around Solaris 11.4. Change-Id: Id11c10cf849511635bd9490c97d978b4bdaa5e06 Reviewed-on: https://gerrit.openafs.org/12947 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 6082243e42525c738239fe429bcb64e0e4f22207 Author: Andrew Deason Date: Wed Mar 7 15:57:56 2018 -0600 hcrypto: Avoid arc4random in kernel Our HAVE_ARC4RANDOM symbol represents the availability of arc4random() in userspace, not in the kernel. On Solaris, we'll define HAVE_ARC4RANDOM, but the built kernel module will be unusable, since we cannot resolve the arc4random symbol. To avoid this, undef HAVE_ARC4RANDOM when building hcrypto for the kernel, just like we do with HAVE_GETUID. Change-Id: I17472420b35e7be6b4f698082714c2e51bdb064b Reviewed-on: https://gerrit.openafs.org/12946 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 3e9ea6107973ccc4fa3d405f5b5d76666bfd624f Author: Andrew Deason Date: Wed Mar 7 13:28:34 2018 -0600 Avoid libtool 'nm' errors Starting around Solaris 11.3, '/usr/bin/nm -p' starts reporting some symbols with the 'C' code.
libtool cannot handle this (libtool bug #22373), which causes global_symbol_pipe in the generated libtool script to be empty. This causes a rather confusing error when we go to actually use libtool to link something ("syntax error near unexpected token '|'"; see libtool bug #20947), and prevents the build from continuing. Address this in two ways: For all Solaris 11 builds, default to /usr/sfw/bin/gnm over /usr/bin/nm. This avoids any interop issues with libtool and nm, since libtool of course works very well with GNU tooling. In addition, try to catch any nm-related errors with libtool at configure time, to provide a more helpful error message. To implement these changes, create a wrapper around LT_INIT, called AFS_LT_INIT. Change-Id: I7d47c17f9d9401dc5dcc9676279bf1e4f53554c4 Reviewed-on: https://gerrit.openafs.org/12945 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 5a8b68153124c3a9224f0b6993df9de9c6c54541 Author: Michael Meffie Date: Thu Feb 22 13:23:18 2018 -0500 venus: convert fs.c to safer string functions Convert string handling to safer functions to avoid buffer overflows. Change-Id: Ibb4f18d78724d87a002e2b0458cba2cceee8670c Reviewed-on: https://gerrit.openafs.org/12923 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit c84f36a9b8c6b6adb9c77bab1c814ccd3aaf6a5b Author: Michael Meffie Date: Mon Feb 19 14:01:56 2018 -0500 venus: fix format overflow warning Recent versions of gcc generate a format overflow warning on the dfstring buffer in fs.c. Increase the size of the buffer to avoid a possible buffer overflow. 
fs.c: In function ‘AclToString’: fs.c:770:30: error: ‘%s’ directive writing up to 1024 bytes into a region of size between 13 and 23 [-Werror=format-overflow=] sprintf(dfsstring, " dfs:%d %s", acl->dfs, acl->cell); ^~ fs.c:770:2: note: ‘sprintf’ output between 8 and 1042 bytes into a destination of size 30 sprintf(dfsstring, " dfs:%d %s", acl->dfs, acl->cell); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Change-Id: Iead8b153a62f2928fabaeee1ed126535f67d7d49 Reviewed-on: https://gerrit.openafs.org/12917 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 70b7f743550a8ce02292a12c4188deaf85b1a533 Author: Michael Meffie Date: Thu Feb 22 16:07:55 2018 -0500 butc: convert butc/dump.c to safer string handling Convert butc/dump.c to safer string handling functions to avoid buffer overflows. Change-Id: I36338804ee5d0ac2eb818c42cf2671497cd5967f Reviewed-on: https://gerrit.openafs.org/12922 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit cec45d59440f55316097cfd6652d2ea26cd55233 Author: Michael Meffie Date: Mon Feb 19 13:57:16 2018 -0500 butc: fix format overflow warning Recent versions of gcc generate an overflow warning in the butc DUMPNAME macro when copying values into the finishedMsg1 buffer. Increase the size of the destination buffer to avoid a possible buffer overflow. 
dump.c:88:24: error: ‘%s’ directive writing up to 63 bytes into a region of size 50 [-Werror=format-overflow=] sprintf(dumpname, "%s (DumpId %u)", name, dbDumpId); ^ dump.c:1294:5: note: in expansion of macro ‘DUMPNAME’ DUMPNAME(finishedMsg1, nodePtr->dumpSetName, dparams.databaseDumpId); ^~~~~~~~ dump.c:88:6: note: ‘sprintf’ output between 12 and 84 bytes into a destination of size 50 sprintf(dumpname, "%s (DumpId %u)", name, dbDumpId); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ dump.c:1294:5: note: in expansion of macro ‘DUMPNAME’ DUMPNAME(finishedMsg1, nodePtr->dumpSetName, dparams.databaseDumpId); ^~~~~~~~ Change-Id: Iadf87a308ab6c500a8407a269bc0fd443ff0c735 Reviewed-on: https://gerrit.openafs.org/12916 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit c44f6f7a8052bdd1fb021e07bb6ae142b61e6b5b Author: Andrew Deason Date: Wed Mar 7 11:32:43 2018 -0600 ubik: Log sync site for SDISK_SendFile USYNC error In SDISK_SendFile, we return a USYNC error if the caller is not the sync site. Say who the sync site is when we do this, to possibly help post-mortem debugging. Change-Id: I62a3565fca20171be20481638c261c4659c68ab2 Reviewed-on: https://gerrit.openafs.org/12943 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit d0805d72b7a48dcaa7abe1aea136a8cd963d76c2 Author: Andrew Deason Date: Wed Mar 7 13:11:03 2018 -0600 Avoid empty libtool -export-symbols-regex pattern Currently, in LT_LDLIB_shlib_missing, we construct our -export-symbols-regex pattern like so (with some escaping): "($(sed -e 's/^/^/' -e 's/$/$/' xxx.sym | tr '\n' '|' | sed -e 's/|$//'))" The idea is that for a .sym file consisting of, for example: foo bar We then generate a regex like (^foo$|^bar$). However, since the 'tr' removes all newlines, the line given to the last 'sed' in the pipeline has no trailing newline. On some systems, such as Solaris, this causes sed to not output anything at all, resulting in a regex pattern of just "()". 
For example: # on Debian $ echo -n foo | sed -e 's/foo/bar/' bar$ # on Solaris $ echo -n foo | sed -e 's/foo/bar/' $ To avoid this, we can change the sed pipeline to not remove the newlines until the very end. Change the way we construct our regex to this instead: "($(sed -e 's/^/^/' -e 's/$/$|/' -e '$ s/|$//' xxx.sym | tr -d '\n'))" So the sed removes the extra '|' in the last element by looking at the last line, instead of looking at the end of the line after the 'tr' conversion. Change-Id: Id382132f6b400bf961dbaa52138a9abd0168118d Reviewed-on: https://gerrit.openafs.org/12944 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit c818f86b79a636532d396887d4f22cc196c86288 Author: Mark Vitale Date: Thu Mar 1 23:16:56 2018 -0500 LINUX: fix RedHat 7.5 ENOTDIR issues Red Hat Linux 7.5 beta introduces a new file->f_mode flag FMODE_KABI_ITERATE as a means for certain in-tree filesystems to indicate that they have implemented file operation iterate() instead of readdir(). The kernel routine iterate_dir() tests this flag to decide whether to invoke the file operation iterate() or readdir(). The OpenAFS configure script detects that the file operation iterate() is available under RH7.5 and so implements iterate() as afs_linux_readdir(). However, since OpenAFS does not set FMODE_KABI_ITERATE on any of its files, the kernel's iterate_dir() will not invoke iterate() for any OpenAFS files. OpenAFS has also not implemented readdir(), so iterate_dir() must return -ENOTDIR. Instead, modify OpenAFS to fall back to readdir() in this case. Change-Id: I242276150ab2a506e1e9c5c752e3f17d36c98935 Reviewed-on: https://gerrit.openafs.org/12935 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 79f33b859aeb3c91f2cce7597fdc138978c4e1d9 Author: Benjamin Kaduk Date: Thu Mar 1 20:28:23 2018 -0600 afs_pioctl: avoid -Wpointer-sign Change the declaration of 'addr' to be a signed int, to match RXAFS_CallBackRxConnAddr() and the afsd_pd_GetInt() used with it. 
This was detected by clang 4.0 in FreeBSD 11.1, via -Wpointer-sign. Change-Id: Ibd2679e6a4519db46f57693ff58221f18f6a2fe1 Reviewed-on: https://gerrit.openafs.org/12934 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit bd6a2484011dad6298c4ce97dd0cd68e0834baa5 Author: Marcio Barbosa Date: Thu Feb 22 17:53:23 2018 -0500 ubik: don't set database epoch to 0 if not needed If our attempt to receive a fresh database from a peer fails, we will overwrite the version.epoch field of our current local copy of the database with an invalid value, "0". The idea behind this approach is to make sure that this database will not be seen as a legitimate copy if the transfer is not completed properly. Although it is questionable if this approach is still necessary (since the current version writes the data into a temporary file), it is undisputed that the database version does not have to be invalidated if the transfer fails in an early stage where no data has been written and we could safely continue to reuse the local copy for read-only queries. Early failures may happen if: 1. The peer sending the database to us is not the peer we believe to be the sync site; 2. The sender is not authorized to call DISK_SendFile; In both cases, the database epoch is invalidated. As a result of that, we may have the following consequences: 1. Reads may not be allowed Once the on disk epoch is invalidated, if the server in question is rebooted, the invalid on disk epoch will be used to initialize the in memory epoch. At this point, reads may not be allowed since urecovery_AllBetter checks if the in memory epoch is greater than 1. Reads should not be blocked forever since the sync-site will send a new database to this remote and, as a result of that, the invalid version will be corrected. 2. Data can be lost If the site with the invalid epoch is the one with the most recent database, the database can be rolled back to an earlier version during a new quorum establishment.
Consider the following scenario where we have three sites: Site A (up - database up to date) (sync-site) Site B (up - database up to date) Site C (down - old database) The epoch of B is invalidated due to the problem fixed by this patch. Then, A is turned off and C is turned on. In this scenario, the new sync-site will distribute the old database held by C since its epoch is greater than 0. To fix the problem in question, do not set the database epoch to 0 if the local database was not modified. Acknowledgements: Hartmut Reuter - found the problem; - suggested a possible solution; Benjamin Kaduk - submitted the first version; Andrew Deason - suggested changes; Change-Id: I4f6a6e92aa0bd4282fab4743ea622815a009fecf Reviewed-on: https://gerrit.openafs.org/12924 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot Reviewed-by: Michael Meffie commit 6d74e3d6a1becf86cec30efc2d01a5692167afe1 Author: Michael Meffie Date: Tue Feb 20 11:51:01 2018 -0500 afs: improve -volume-ttl error messages Change the afs call which sets the volume ttl value to return EFAULT instead of EINVAL when given an out of range value for the volume ttl parameter. This is more consistent with the other op codes, which return EFAULT when given an out of range parameter, and allows the caller to distinguish between an invalid opcode and a bad parameter. Move the volume ttl range constants to afs_args.h, which is where constants related to the op codes are supposed to be defined. This makes the constants available to the caller in afsd.c as well as the implementation in afs_call.c. Update afsd to print a more sensible error message when the volume ttl set call fails due to an out of range parameter.
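The EFAULT/EINVAL distinction described above can be sketched in isolation. This is only an illustration, not the afs_call.c or afsd.c code: the opcode number, the range limits, and the names ex_set_param and ex_describe are all hypothetical and were picked just so the example compiles.

```c
#include <errno.h>

/* Hypothetical opcode and range, chosen only for this illustration. */
#define EX_OP_SET_VOLUME_TTL 1
#define EX_MIN_VOLUME_TTL 600
#define EX_MAX_VOLUME_TTL 86400

/* Stand-in for the call that sets the ttl: returns 0 or an errno value. */
static int
ex_set_param(int opcode, long value)
{
    if (opcode != EX_OP_SET_VOLUME_TTL)
        return EINVAL;          /* the opcode itself is not recognized */
    if (value < EX_MIN_VOLUME_TTL || value > EX_MAX_VOLUME_TTL)
        return EFAULT;          /* known opcode, out of range parameter */
    return 0;
}

/* Caller side: the two error codes allow two different messages. */
static const char *
ex_describe(int code)
{
    switch (code) {
    case 0:
        return "ok";
    case EFAULT:
        return "volume ttl out of range";
    case EINVAL:
        return "operation not supported";
    default:
        return "unexpected error";
    }
}
```

With a single error code for both cases, the caller could not tell an old kernel module that lacks the opcode apart from a bad value supplied by the user.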
Change-Id: I6b3ab7d38a60464017daf06f70080a90d2a7a429 Reviewed-on: https://gerrit.openafs.org/12918 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 278581c24a802834719e0d57f27978321556c9bb Author: Michael Meffie Date: Tue Feb 20 20:31:11 2018 -0500 redhat: package libuafs perl bindings Require the swig package as a build dependency. Build and package the libuafs perl bindings. Place these libraries in the openafs-devel package, along with the man page (moved from the openafs-client package). This fixes an rpm build error when the swig package is present on the build system, RPM build errors: Installed (but unpackaged) file(s) found: /usr/lib64/perl/AFS/ukernel.pm /usr/lib64/perl/ukernel.so FIXES 134470 Change-Id: Ifa8a0938f0c16e6099cd2923a71dd6466052a4d8 Reviewed-on: https://gerrit.openafs.org/12919 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit f82d1c7d5aeae148305e867c1f79c6ea2f9e0a2a Author: Jeffrey Altman Date: Sat Feb 10 10:47:24 2018 -0500 rx: Do not count RXGEN_OPCODE towards abort threshold An RXGEN_OPCODE is returned for opcodes that are not implemented by the rx service. These opcodes might be deprecated opcodes that are no longer supported or more recently registered opcodes that have yet to be implemented. Clients should not be punished for issuing unsupported calls. The clients might be old and are issuing no longer supported calls or they might be newer and are issuing yet to be implemented calls as part of a feature test and fallback strategy. This change ignores RXGEN_OPCODE errors when deciding how to adjust the rx_call.abortCount. When an RXGEN_OPCODE abort is sent the rx_call.abortCount and rx_call.abortError are left unchanged, which preserves the state for the next failing call. Note that this change intentionally prevents the incrementing of the abortCount for client connections as they never send delay aborts.
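A simplified model of the bookkeeping described above, as a sketch only. The struct ex_abort_state and the function ex_note_abort are hypothetical and do not appear in rx; the field names are borrowed from the text above, the numeric value of EX_RXGEN_OPCODE is a placeholder so the example compiles, and resetting the count when a different error code is seen is an assumption of this sketch.

```c
/* Placeholder value for this sketch only. */
#define EX_RXGEN_OPCODE (-455)

/* Hypothetical per-call state, named after the fields in the text. */
struct ex_abort_state {
    int abortError;     /* last error that counted towards the threshold */
    int abortCount;     /* how many counted aborts have been sent */
};

/* Record that an abort with 'error' is being sent. */
static void
ex_note_abort(struct ex_abort_state *st, int error)
{
    if (error == EX_RXGEN_OPCODE) {
        /* Unsupported opcode: leave both fields untouched. */
        return;
    }
    if (error != st->abortError) {
        /* Assumption: a different error restarts the count. */
        st->abortError = error;
        st->abortCount = 0;
    }
    st->abortCount++;
}
```

Because the unsupported-opcode case returns before touching the state, a client that probes for a new RPC and then falls back to an old one is never pushed closer to the delayed-abort threshold by the probe.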
Change-Id: I87787e7ad0a85d52a01711bb75e2be1af9a868b8 Reviewed-on: https://gerrit.openafs.org/12906 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 3ddae7d168ac08c46b4e31517fdb1f6ac1ae63ac Author: Andrew Deason Date: Thu Feb 15 18:40:07 2018 -0600 RHEL: Add aarch64/arm64 to spec file Change-Id: I2247f40a839e976605e80cf468d7a023598d5dc5 Reviewed-on: https://gerrit.openafs.org/12911 Tested-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit e6c2624249a6ab96053c1d1134aec8e3f6bcee9e Author: Andrew Deason Date: Thu Feb 15 16:53:57 2018 -0600 doc: Edits to the 'afsd -volume-ttl' manpage Make a few misc changes to the text for the new -volume-ttl option: - Minor grammatical/typo fixes - Emphasize a little more that the default behavior allows for vldb info to be cached _forever_ - Provide some info on the effects of changing this value - Provide a suggested "typical" value, to give some clue as to what should be set here, so a curious user doesn't just set this to the first value they see (10 minutes) Change-Id: Ib6b2871b111c392260ea80e26273201b09d4c402 Reviewed-on: https://gerrit.openafs.org/12909 Reviewed-by: Benjamin Kaduk Tested-by: Andrew Deason commit a66629eac4dda4eea37b4f06e0850641cb2a7387 Author: Andrew Deason Date: Thu Feb 15 16:41:33 2018 -0600 rxdebug: NUL-terminate version before printing Currently, 'rxdebug -version' never initializes the buffer we read the version string into. Usually this is not noticeable, since all OpenAFS binaries tend to pad the Rx version response packet with NULs, so we get back several NULs to terminate the string. However, this is not guaranteed, and if we do not get back a NUL-terminated string, we can easily read beyond the end of the buffer. To avoid this, initialize the 'version' buffer with NULs before we do anything, and set the last byte to NUL, in case we exactly filled the buffer. 
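The two-step fix described above (clear the buffer first, then force the last byte to NUL) can be sketched as follows. This is an illustration, not the rxdebug source: ex_copy_version, EX_VERSION_LEN, and the reply handling are hypothetical names and simplifications introduced here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer size, chosen only for this illustration. */
#define EX_VERSION_LEN 64

/*
 * Copy a version reply that may lack a terminating NUL into 'version',
 * which must have room for EX_VERSION_LEN bytes. Returns the length of
 * the resulting string.
 */
static size_t
ex_copy_version(char *version, const char *reply, size_t reply_len)
{
    /* Start from all NULs so that a short reply is always terminated. */
    memset(version, 0, EX_VERSION_LEN);

    if (reply_len > EX_VERSION_LEN)
        reply_len = EX_VERSION_LEN;
    memcpy(version, reply, reply_len);

    /* If the reply exactly filled the buffer, terminate it anyway. */
    version[EX_VERSION_LEN - 1] = '\0';

    return strlen(version);
}
```

The memset covers replies shorter than the buffer, and the final assignment covers a reply that fills it completely; either step alone leaves one of those cases unterminated.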
Change-Id: I1b1ae546c01f018a9b4e198f918c2d9eb86015d6 Reviewed-on: https://gerrit.openafs.org/12908 Reviewed-by: Benjamin Kaduk Tested-by: Andrew Deason commit 4f7550dcaf9375046514cdd97cea0f667e955e9f Author: Andrew Deason Date: Sat Mar 7 17:27:47 2015 -0600 Add support for arm64_linux26 Add support for the arm64/aarch64 architecture on Linux 2.6+. The param header file is mostly combined from arm and amd64. Note that the code for syscall interception has not been updated for arm64, so this will not build on arm64 without support for kernel keyrings. This also does not define any AFS syscall number, since no number in the Linux arm64 syscall table is "free" for us to use, as far as I am aware. Adapted from initial patches from Micheal Waltz . Change-Id: I1ee239ded17d8fea3b91b70405215aa1b3f7a6e9 Reviewed-on: https://gerrit.openafs.org/11940 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit b792dea0f1f83673b0b045adf608412901b3024c Author: Andrew Deason Date: Sun Mar 8 11:47:28 2015 -0500 hcrypto: Avoid 'double' param in arm64 kernel code Currently, the RAND_add function in hcrypto uses a floating point argument (specifically, a 'double'), as well as any implementations of RAND_add. On Linux arm64, we cannot use floating point code in the kernel, since the kernel module is compiled with -mgeneral-regs-only, which prevents the use of floating point registers. No code in the tree actually makes use of this argument, but its mere presence is enough to cause an error with at least some versions of gcc with certain arguments. To get around this, simply change all instances of 'double' in hcrypto to be a void pointer instead. This allows the code to compile as long as nobody actually uses that argument in the kernel. If the code is changed such that we do actually use that argument, the argument will be a void* and so will probably (hopefully) cause a compiler error, and the code will need to be examined to make sure this workaround doesn't break anything. 
We already do this on Solaris, which has similar issues for different compiler versions and compiler flags. Add arm64 Linux to the cases where we do this, but restrict this to kernel code only, to try to avoid doing this more often than necessary. Change-Id: Ifd10786cd9ac6c9d5152b927e180b7362131f359 Reviewed-on: https://gerrit.openafs.org/11939 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 0a896b93c86e86f5b438880ef1634b4e39ee5779 Author: Andrew Deason Date: Fri Mar 13 10:33:05 2015 -0500 Do not set default AFS_SYSCALL Currently, afs_args.h will define an AFS_SYSCALL value by default (31) if the current platform does not define an AFS_SYSCALL value on its own (via its param.h info). This is dangerous, since if a platform does not define an AFS_SYSCALL, or if it happens to not be defined for any reason, some code may try to call syscall 31, which could be anything. So get rid of this. If this breaks the build on any platform, then that platform should define AFS_SYSCALL in its own platform-specific header, or get rid of the problematic AFS_SYSCALL usage. Change-Id: I9583c8e5adc4106848a437d81306000490787ef3 Reviewed-on: https://gerrit.openafs.org/11938 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit ed513bb516acdb28fc6bbf01714ef2e1df422a8a Author: Andrew Deason Date: Wed Mar 11 12:55:42 2015 -0500 Do not require AFS_SYSCALL Various parts of the code make use of AFS_SYSCALL in order to communicate with the libafs kernel module. Even though most modern platforms do not use an actual syscall anymore (instead using an ioctl-based method or similar to emulate the traditional AFS syscall), some code paths rely on AFS_SYSCALL as a fallback, or just use AFS_SYSCALL because they were never updated to use the newer methods. Even platforms that do not use the traditional AFS syscall still define the AFS_SYSCALL number, in case someone still uses it for something. 
However, some platforms do not have an AFS syscall number; there is no "slot" allocated to us, so we cannot safely issue any syscall. For those platforms, we must not reference AFS_SYSCALL at all, or we will fail to build. So, get rid of these references to AFS_SYSCALL if it is not defined. In some places, we can just avoid the relevant code making the syscall. In a few other places, we just pretend like the libafs kernel module was not loaded and yield an ENOSYS error, to make the code simpler. Change-Id: I38e033caf7149c2b1b567f9877221ca8551db2ea Reviewed-on: https://gerrit.openafs.org/11937 Tested-by: BuildBot Reviewed-by: Ian Wienand Reviewed-by: Benjamin Kaduk commit f5794e029903db79f345f42582230a1fd0f7d823 Author: Andrew Deason Date: Mon Feb 5 00:07:10 2018 -0600 util: Add the AFS_STRINGIZE() macro Add a macro to help with easily printing the value of #define'd constants, called AFS_STRINGIZE(). For example: printf("The value of AFS_SYSCALL is: " AFS_STRINGIZE(AFS_SYSCALL) "\n"); Change-Id: I19a3e9d930f1ca2085506957b4e96dff5bf1c22e Reviewed-on: https://gerrit.openafs.org/12893 Tested-by: BuildBot Reviewed-by: Ian Wienand Reviewed-by: Benjamin Kaduk commit 32d0493a7e4f74f5e5efdfde5eca29ed7d1bf3ec Author: Caitlyn Marko Date: Thu Feb 9 09:16:17 2017 -0500 SOLARIS: save kernel module function arguments for debugging Add the -Wu,-save_args compiler option when building kernel modules under Solaris 10 and 11 for the amd64 architecture. Binaries generated with this option save function arguments on the stack during function entry for debugging purposes. Up to six integer arguments are saved on function entry, and are not modified during the execution of the function. 
[mmeffie: commit message update] Change-Id: I7ee50e5108a46685efa17d0380883c6d1702a5e4 Reviewed-on: https://gerrit.openafs.org/12798 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 88cb536f99dc58fdbeb9fa6c47c26774241a0cb6 Author: Marcio Barbosa Date: Mon Feb 5 21:16:17 2018 +0000 autoconf: detect ctf-tools and add ctf to libafs CTF is a reduced form of debug information similar to DWARF and stab. It describes types and function prototypes. The principal objective of the format is to shrink the data size as much as possible so that it could be included in a production environment. MDB, DTrace, and other tools use CTF debug information to read and display structures correctly. This commit introduces a new configure option called --with-ctf-tools. This option can be used to specify an alternative path where the tools can be found. If the path is not provided, the tools will be searched in a set of default directories (including $PATH). The CTF debugging information will only be included if the corresponding --enable-debug / --enable-debug-kernel is specified. Note: at the moment, the Solaris kernel module is the only module benefited by this commit. Change-Id: If0a584377652a573dd1846eae30d42697af398d0 Reviewed-on: https://gerrit.openafs.org/12680 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit c7c71d2429cf685f3ffad6b2e6d102d900edc197 Author: Ian Wienand Date: Fri Feb 2 10:52:26 2018 +1100 Add .gitreview git-review [1] makes it much easier to submit changes. Add a default configuration file. 
[1] https://docs.openstack.org/infra/git-review/usage.html Change-Id: I9615a81c9b199c86e8de2fedc710e3246deeac84 Reviewed-on: https://gerrit.openafs.org/12884 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 5e09a694ec2c0cd20f5dee500eff6bc3dd04c097 Author: Mark Vitale Date: Tue Jun 30 01:54:21 2015 -0400 SOLARIS: Avoid vcache locks when flushing pages for RO vnodes We have multiple code paths that hold the following locks at the same time: - avc->lock for a vcache - The page lock for a page in 'avc' In order to avoid deadlocks, we need a consistent ordering for obtaining these two locks. The code in afs_putpage() currently obtains avc->lock before the page lock (Obtain*Lock is called before pvn_vplist_dirty). The code in afs_getpages() also obtains avc->lock before the page lock, but it does so in a loop for all requested pages (via pvn_getpages()). On the second iteration of that loop, it obtains avc->lock, and the page from the first iteration of the loop is still locked. Thus, it obtains a page lock before locking avc->lock in some cases. Since we have two code paths that obtain those two locks in a different order, a deadlock can occur. Fixing this properly requires changing at least one of those code paths, so the locks are taken in a consistent order. However, doing so is complex and will be done in a separate future commit. For this commit, we can avoid the deadlock for RO volumes by simply avoiding taking avc->lock in afs_putpages() at all while the pages are locked. Normally, we lock avc->lock because pvn_vplist_dirty() will call afs_putapage() for each dirty page (and afs_putapage() requires avc->lock held). But for RO volumes, we will have no dirty pages (because RO volumes cannot be written to from a client), and so afs_putapage() will never be called. So to avoid this deadlock issue for RO volumes, avoid taking avc->lock across the pvn_vplist_dirty() call in afs_putpage(). 
We now pass a dummy pageout callback function to pvn_vplist_dirty() instead, which should never be called, and which panics if it ever is. We still need to hold avc->lock a few other times during afs_putpage() for other minor reasons, but none of these hold page locks at the same time, so the deadlock issue is still avoided. [mmeffie: comments, and fix missing write lock, fix lock releases] [adeason: revised commit message] Change-Id: Iec11101147220828f319dae4027e7ab1f08483a6 Reviewed-on: https://gerrit.openafs.org/12247 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit 073522b3d49467af107d1143cfa015c53347e1e3 Author: Michael Meffie Date: Wed Jan 31 16:52:40 2018 -0500 add rfc3961.h to kernel sources Export this header to the kernel sources in the libafs_tree, since it is needed for the kernel module build. FIXES 134476 Change-Id: Id359c6d065c259601d14ee5c02b93647f86a0288 Reviewed-on: https://gerrit.openafs.org/12882 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 3ca1352170f87994d42578c5bc75e52c4103bc69 Author: Michael Meffie Date: Mon Feb 8 12:12:22 2016 -0500 CellServDB update 14 Mar 2017 Update all remaining copies of CellServDB in the tree, and make the Red Hat packaging use it by default too. Change-Id: I5a70a7c658ad0056cd10945bb730e84f0edfb730 Reviewed-on: https://gerrit.openafs.org/12880 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit 88dc4d93f5ef080da8f56fac453f095e6c79d4a0 Author: Benjamin Kaduk Date: Mon Jan 8 22:28:24 2018 -0600 Add param.h files for recent FreeBSD Add files for FreeBSD 10.4, 11.1, and 12.0 (12-CURRENT), for i386 and amd64. 
Change-Id: I904f576914bb965a659750e6302f011acf66ba81 Reviewed-on: https://gerrit.openafs.org/12863 Tested-by: BuildBot Reviewed-by: Stephan Wiesand Reviewed-by: Benjamin Kaduk commit c390f368a5012f866c1b4ce46d6ac6af6cef2fd5 Author: Benjamin Kaduk Date: Mon Jan 8 21:27:04 2018 -0600 FBSD: catch up to missing sysnames Add sysnames for i386 and amd64 10.4, 11.1, and 12.0 (12-CURRENT, at present). Change-Id: If38ecca7b2b3e40c186b7e9321ce017b4711139c Reviewed-on: https://gerrit.openafs.org/12862 Tested-by: BuildBot Reviewed-by: Stephan Wiesand Reviewed-by: Benjamin Kaduk commit f5c289d00aaf7c5525b477da5b89f6675456c211 Author: Marcio Barbosa Date: Wed Jun 21 16:24:05 2017 -0400 ubik: check if epoch is sane before db relabel The sync-site relabels its database at the end of the first write transaction. The new label will be equal to the time at which the sync-site in question first received its coordinator mandate. This time is stored by a global called ubik_epochTime. In order to make sure that the new database label is sane, only relabel the database if ubik_epochTime is within a specific range. Change-Id: I2408569e5de46d387f63cbc2fab05ea1264a505c Reviewed-on: https://gerrit.openafs.org/12640 Reviewed-by: Mark Vitale Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 50c1d1088d2adcbb37b6a9d23fdd63617b1267be Author: Marcio Barbosa Date: Mon Aug 21 15:50:14 2017 -0400 ubik: update ubik_dbVersion during SDISK_SendFile The ubik_dbVersion global represents the sync site's database version and it is mostly used by the remote sites for sanity checks. Currently, this global is updated when database changes are made on the sync site (SDISK_Commit or SDISK_SetVersion), as well as every time we vote "yes" for the sync-site in a beacon reply. Unfortunately, ubik_dbVersion is not updated when a copy of the sync site's database is received via DISK_SendFile, and it won't get updated until our next "yes" vote. 
During this window, the current database version will not match ubik_dbVersion. As a result, any write transaction during this time frame will fail on the remote site in question. To fix this problem, do not wait for the next beacon packet to update ubik_dbVersion when the sync site's database is received; just update it when we get the new database. Since no write transactions are allowed while the db is transferring, ubik_dbVersion can be safely updated. Change-Id: Ide7a695a69cb3229ad585d9e56c5ddc2efb76dd7 Reviewed-on: https://gerrit.openafs.org/12716 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk commit ef1d4c8d328e9b9affc9864fd084257e9fa08445 Author: Andrew Deason Date: Thu Jan 11 21:27:28 2018 -0600 LINUX: Avoid locking inode in check_dentry_race Currently, check_dentry_race locks the parent inode in order to ensure it is not running in parallel with d_splice_alias for the same inode. (For old Linux kernel versions; see commit b0461f2d: "LINUX: Workaround d_splice_alias/d_lookup race".) However, it is possible to hit this area of code when the parent inode is already locked. When someone tries to create a file, directory, or symlink, Linux tries to look up the dentry for the target path, to see if it already exists. While looking up the last component of the path, Linux locks the directory, and if it finds a dentry for the target name, it calls d_invalidate on it while the parent directory is locked. For a dentry with a NULL inode, we'll then try to lock the parent inode in check_dentry_race. But since the inode is already locked, we will deadlock. From a user's point of view, the hang can be reproduced by doing something similar to: $ mkdir dir # succeeds $ rmdir dir $ ls -l dir ls: cannot access dir: No such file or directory $ mkdir dir # hangs To avoid this, we can just change which lock we're using to prevent check_dentry_race/d_splice_alias from running in parallel.
Instead of locking the parent inode, introduce a new global lock (called dentry_race_sem), and lock that in check_dentry_race and around our d_splice_alias call. We know that those are the only two users of this new lock, so this should avoid any such deadlocks. This does potentially reduce performance, since all tasks that hit check_dentry_race or d_splice_alias will take the same global lock. However, this at least still allows us to make use of negative dentries, and this entire code path only applies to older Linux kernels. It could be possible to add a new lock into struct vcache instead, but using a global lock like this commit does is much simpler. Change-Id: Ide0f21145c83d6fbb34c637d8a36c8cd21549940 Reviewed-on: https://gerrit.openafs.org/12868 Tested-by: Benjamin Kaduk Reviewed-by: Benjamin Kaduk commit f599e1ce6354c42a9c0c8f7205ba8a03c35ea72b Author: Michael Meffie Date: Wed Jan 17 17:33:50 2018 -0500 redhat: fix conditional for kernel-debuginfo files directive Commit 443dd5367e0cd9050ad39a6594c5be521271b4e9 added support for a separate debuginfo package for the kernel module. Unfortunately, the %files directive for the kernel module debuginfo package was incorrectly placed in the %if stanza of the build_userspace condition, so the rpmbuild fails when attempting to build just the kernel module. That is, when running rpmbuild with the options: rpmbuild --define "build_userspace 0" --define "build_modules 1" ... rpmbuild fails with: RPM build errors: Installed (but unpackaged) file(s) found: /usr/lib/debug/lib/modules/.../extra/openafs/openafs.ko.debug Fix this by moving the new %files directive out of the build_userspace conditional. 
Change-Id: I46e74b660048022a4cc4327835c6055402a34ccf Reviewed-on: https://gerrit.openafs.org/12874 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 6a2b85cd4c00a08e165cb96d2cb56bf87c6324bc Author: Michael Meffie Date: Sat Dec 30 17:59:38 2017 -0500 autoconf: refactor linux-checks.m4 Further refactoring of the autoconf macros. Divide the linux kernel checks into smaller files. This is a non-functional change. Care has been taken to preserve the ordering of the autoconf tests. Except for whitespace, the generated configure file has not been changed by this refactoring. This has been verified with a 'diff -u -w -B' comparison of the generated configure file before and after applying this commit. Change-Id: I5ea4c9e3a0aeff1767ef561bdb8361781694ee28 Reviewed-on: https://gerrit.openafs.org/12844 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 3c2e39bab7d927aa5f20d02a5e327927a4b2b553 Author: Michael Meffie Date: Sat Dec 30 12:12:59 2017 -0500 autoconf: refactor ostype.m4 Further refactoring of the autoconf macros. Move more linux and solaris specific checks into their own files. This is a non-functional change. Care has been taken to preserve the ordering of the autoconf tests. Except for whitespace, the generated configure file has not been changed by this refactoring. This has been verified with a 'diff -u -w -B' comparison of the generated configure file before and after applying this commit. Change-Id: Ib3e7b1270826970c541a695230f4e3cd13cf9e3d Reviewed-on: https://gerrit.openafs.org/12843 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit c72622a244e561173e86ffe88ee3c9a8c823a76a Author: Michael Meffie Date: Fri Dec 29 14:24:28 2017 -0500 autoconf: refactor acinclude.m4 The acinclude.m4 file is very large and often needs to be changed by unrelated commits. Divide the large acinclude.m4 into a number of smaller files to avoid so many conflicts and to make the autoconf system easier to maintain. This is a non-functional change.
Care has been taken to preserve the ordering of the autoconf tests. Except for whitespace, the generated configure file has not been changed by this refactoring. This has been verified with a 'diff -u -w -B' comparison of the generated configure file before and after applying this commit. Change-Id: I70e7f846dea0055d00a60a47422aa73bff25c4c6 Reviewed-on: https://gerrit.openafs.org/12842 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 0760feb7992e1e39f716c5f583fe7f6e85584262 Author: Benjamin Kaduk Date: Thu Jan 4 22:00:15 2018 -0600 rx: remove trailing semicolons from FBSD mutex operations Since the first introduction of FreeBSD support, the macros (MUTEX_ENTER, etc.) for kernel mutex operations have included trailing semicolons, unique among all the platforms. This did not cause problems until the recent work on rx event handlers, which put a MUTEX_ENTER() in the body of an 'if' clause with no brackets, and attempted to follow it with an 'else' clause. This results in the following (rather obtuse) compiler error: /root/openafs/src/rx/rx.c:3666:5: error: expected expression else ^ The problem is more visible in the preprocessed source, as if (condition) expression;; else other_expression; is clearly invalid C. To fix the FreeBSD kernel module build, remove the unneeded semicolons. Change-Id: I191009ad412852dcc03cd71a0982fe41a953301d Reviewed-on: https://gerrit.openafs.org/12853 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit decb4308d4e18ad9f6f181e3df5f737698dba7ad Author: Benjamin Kaduk Date: Sat Dec 9 11:44:51 2017 -0600 libuafs: remove stale afs_nfsdisp.lo rule afs_nfsdisp.lo is not used, so we do not need a build rule for it.
Change-Id: I4ca53a4823b0ccd5bfd769867f6766bd05ea4ceb Reviewed-on: https://gerrit.openafs.org/12802 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit e443a9fb67dbc29e6cc36661a4ac6e91af113f23 Author: Benjamin Kaduk Date: Sat Dec 9 11:37:59 2017 -0600 Replace with Our in-tree xdr.h appears to have started life as a concatenation of rpc/types.h and rpc/xdr.h, and should include all the needed functionality. Indeed, commit 7293ddf325b149cae60d3abe7199d08f196bd2b9 even indicates that we expect to be using our in-tree XDR everywhere anyway, so the system XDR is superfluous. Note that afs/sysincludes.h (not afsincludes.h!) already includes rx/xdr.h ifndef AFS_LINUX22_ENV. This change should help systems running glibc 2.26 or newer, which has stopped providing the Sun RPC headers by default. While here remove some duplicate includes of rpc/types.h in the AIX-specific sources. The Solaris NFS translator bits cannot really be changed, since the system headers are used and have tight interdependencies. Update rxgen to not emit rpc/types.h inclusion. [mmeffie: squash 12801 to not emit rpc/types.h from rxgen] Change-Id: I0b195216affa06ab9e259cb0bab0c8286a1636d9 Reviewed-on: https://gerrit.openafs.org/12800 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit afbc199f152cc06edc877333f229604c28638d07 Author: Mark Vitale Date: Thu Nov 30 20:26:46 2017 -0500 LINUX: Avoid d_invalidate() during afs_ShakeLooseVCaches() With recent changes to d_invalidate's semantics (it returns void in Linux 3.11, and always returns success in RHEL 7.4), it has become increasingly clear that d_invalidate() is not the best function for use in our best-effort (nondisruptive) attempt to free up vcaches that is afs_ShakeLooseVCaches(). 
The new d_invalidate() semantics always force the invalidation of a directory dentry, which contradicts our desire to be nondisruptive, especially when that directory is being used as the current working directory for a process. Our call to d_invalidate(), intended to merely probe for whether a dentry can be discarded without affecting other consumers, instead would cause processes using that dentry as a CWD to receive ENOENT errors from getcwd(). A previous commit (c3bbf0b4444db88192eea4580ac9e9ca3de0d286) tried to address this issue by calling d_prune_aliases() instead of d_invalidate(), but d_prune_aliases() does not recursively descend into children of the given dentry while pruning, leaving it an incomplete solution for our use-case. To address these issues, modify the shakeloose routine TryEvictDentries() to call shrink_dcache_parent() and maybe __d_drop() for directories, and d_prune_aliases() for non-directories, instead of d_invalidate(). (Calls to d_prune_aliases() for directories have already been removed by reverting commit c3bbf0b4444db88192eea4580ac9e9ca3de0d286.) Just like d_invalidate(), shrink_dcache_parent() has been around "forever" (since pre-git v2.6.12). Also like d_invalidate(), it "walks" the parent dentry's subdirectories and "shrinks" (unhashes) unused dentries. But unlike d_invalidate(), shrink_dcache_parent() will not unhash an in-use dentry, and has never changed its signature or semantics. d_prune_aliases() has also been available "forever", and has also never changed its signature or semantics. The lack of recursive descent is not an issue for non-directories, which cannot have such children. 
[kaduk@mit.edu: apply review feedback to fix locking and avoid extraneous changes, and reword commit message] Change-Id: Icb6138ee5785e0ef82a9b85b1d2651dfd0830043 Reviewed-on: https://gerrit.openafs.org/12830 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 5076dfc14b980aed310f3862875d5e9919fa199d Author: Mark Vitale Date: Thu Nov 30 17:56:13 2017 -0500 LINUX: consolidate duplicate code in osi_TryEvictDentries The two stanzas for HAVE_DCACHE_LOCK are now functionally identical; remove the preprocessor conditionals and duplicate code. A minor functional change is incurred for very old (before 2.6.38) Linux versions that have dcache_lock; we are now obtaining the d_lock as well. This is safe because d_lock is also quite old (pre-git, 2.6.12), and it is a spinlock that's only held for checking d_unhashed. Therefore, it should have negligible performance impact. It cannot cause deadlocks or violate locking order, because spinlocks can't be held across sleeps. Change-Id: I08faf204e6bd82c4401cdf6048d12cd551dd18fc Reviewed-on: https://gerrit.openafs.org/12792 Reviewed-by: Benjamin Kaduk Reviewed-by: Andrew Deason Tested-by: BuildBot commit 0678ad26b6069040a6ea86866fb59ef5968ea343 Author: Mark Vitale Date: Thu Nov 30 16:51:32 2017 -0500 LINUX: consolidate duplicate code in canonical_dentry The two stanzas for HAVE_DCACHE_LOCK are now identical; remove the preprocessor conditionals and duplicate code. No functional change should be incurred by this commit. Change-Id: I15cd4631d1932dcfb920313acb82fcbe570087e8 Reviewed-on: https://gerrit.openafs.org/12791 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 652cd597d9b3cf1a9daccbbf6bf35f1b0cd55a94 Author: Mark Vitale Date: Thu Nov 30 16:46:16 2017 -0500 LINUX: add afs_d_alias_lock & _unlock compat wrappers Simplify some #ifdefs for HAVE_DCACHE_LOCK by pushing them down into new helpers in osi_compat.h. No functional change should be incurred by this commit.
Change-Id: Ia0dc560bc84c8db4b84ddcc77a17bab5fbf93af9 Reviewed-on: https://gerrit.openafs.org/12790 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 74f4bfc627c836c12bb7c188b86d570d2afdcae8 Author: Mark Vitale Date: Thu Nov 30 16:08:38 2017 -0500 LINUX: create afs_linux_dget() compat wrapper For dentry operations that cover multiple dentry aliases of a single inode, create a compatibility wrapper to hide differences between the older dget_locked() and the current dget(). No functional change should be incurred by this commit. Change-Id: I2bb0d453417f37707018f6ba5859903c3d34c8ff Reviewed-on: https://gerrit.openafs.org/12789 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 367693bd7da2de593e3329f6acc4a4d07621fb97 Author: Mark Vitale Date: Thu Nov 30 13:45:27 2017 -0500 Revert "LINUX: do not use d_invalidate to evict dentries" Linux recently changed the semantics of d_invalidate() to: - return void - invalidate even a current working directory OpenAFS commit c3bbf0b4444db88192eea4580ac9e9ca3de0d286 switched libafs to use d_prune_aliases() instead. However, since that commit, several things have happened: - RHEL 7.4 changed the semantics of d_invalidate() such that it invalidates the cwd, but did NOT change the return type to void. This broke our autoconf test for detecting the new semantics. - Further research reveals that d_prune_aliases() was not the best choice for replacing d_invalidate(). This is because for directories, d_prune_aliases() doesn't invalidate dentries when they are referenced by its children, and it doesn't walk the tree trying to invalidate child dentries. So it can leave dentries dangling, if the only references to those dentries are via children. In preparation for future commits, revert c3bbf0b4444db88192eea4580ac9e9ca3de0d286.
Change-Id: Iafbef23a6070180c0e21eb01a2d59385ef52f55c Reviewed-on: https://gerrit.openafs.org/12788 Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit f8247078bd33a825d8734b2c8f05120d15ab3ffd Author: Mark Vitale Date: Thu Nov 30 14:04:48 2017 -0500 Revert "LINUX: eliminate unused variable warning" This reverts commit 19599b5ef5f7dff2741e13974692fe4a84721b59 to allow also reverting commit c3bbf0b4444db88192eea4580ac9e9ca3de0d286 . Change-Id: I2780fe68d352f0f1def198f21127ec944d1d2c1d Reviewed-on: https://gerrit.openafs.org/12787 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit fb1f14d8ee963678a9caad0538256c99c159c2c4 Author: Stephan Wiesand Date: Fri Dec 22 14:40:32 2017 +0100 Linux 4.15: check for 2nd argument to pagevec_init Linux 4.15 removes the distinction between "hot" and "cold" cache pages, and pagevec_init() no longer takes a "cold" flag as the second argument. Add a configure test and use it in osi_vnodeops.c . Change-Id: Ia5287b409b2a811d2250c274579e6f15fd18fdbb Reviewed-on: https://gerrit.openafs.org/12824 Tested-by: BuildBot Reviewed-by: Marcio Brito Barbosa Tested-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit be5f5b2aff2d59986dd8e7dd7dd531be24c27cb2 Author: Stephan Wiesand Date: Fri Dec 22 14:17:09 2017 +0100 Linux: use plain page_cache_alloc Linux 4.15 removes the distinction between "hot" and "cold" cache pages, and no longer provides page_cache_alloc_cold(). Simply use page_cache_alloc() instead, rather than adding yet another test. 
Change-Id: I34e734223927030f7ff252acb61120366a808ad6 Reviewed-on: https://gerrit.openafs.org/12823 Tested-by: BuildBot Reviewed-by: Marcio Brito Barbosa Tested-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit 443dd5367e0cd9050ad39a6594c5be521271b4e9 Author: Pat Riehecky Date: Thu Mar 12 14:33:10 2015 -0500 redhat: separate debuginfo package for kmod rpm Place the debuginfo for the kmod into its own rpm so that it doesn't have to track against the userspace packages. FIXES 132034 Change-Id: I60a753275d896a89c1f6896c653d78a4e1fe7e2c Reviewed-on: https://gerrit.openafs.org/11867 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit fd4eaebb60dbefc27be98015fee23a3cf5d9752d Author: Christof Hanke Date: Mon Dec 18 16:58:39 2017 +0100 Avoid gcc warning When using the configure option --enable-checking with gcc 7.2.1, the compilation fails with vutil.c:860:20: error: ‘%s’ directive writing up to 255 bytes into \ a region of size 63 [-Werror=format-overflow=] This can be seen in the logs of the openSUSE Tumbleweed builder for e.g. build 2368. Avoid this warning by using snprintf which is provided by libroken for all platforms. Change-Id: I6acd3a1c06760abc8144c0892812c3bb50477227 Reviewed-on: https://gerrit.openafs.org/12813 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 6e57b22642bafb177e0931b8fb24042707d6d62f Author: Marcio Barbosa Date: Thu Oct 12 12:42:40 2017 -0300 macos: make the OpenAFS client aware of APFS Apple has introduced a new file system called APFS. Starting from High Sierra, APFS replaces Mac OS Extended (HFS+) as the default file system for solid-state drives and other flash storage devices. The current OpenAFS client is not aware of APFS. As a result, the installation of the current client into an APFS volume will panic the machine. To fix this problem, make the OpenAFS client aware of APFS. 
Change-Id: Ib5ac88b87f348744864f4e33f1f222efbc852d41 Reviewed-on: https://gerrit.openafs.org/12743 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit e533d0737058940d59d93467c9b4d6d3ec2834e6 Author: Marcio Barbosa Date: Fri Oct 6 10:01:12 2017 -0300 macos: packaging support for MacOS X 10.13 This commit introduces the new set of changes / files required to successfully create the dmg installer on OS X 10.13 "High Sierra". Change-Id: Id9da3cf959627a13d8cfd1d1d7412820e46ad63e Reviewed-on: https://gerrit.openafs.org/12742 Tested-by: BuildBot Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit 804c9cbf501d4ca91b69ad8fd6d64e49efa25a47 Author: Marcio Barbosa Date: Tue Oct 3 17:01:56 2017 -0300 macos: add support for MacOS 10.13 This commit introduces the new set of changes / files required to successfully build the OpenAFS source code on OS X 10.13 "High Sierra". Change-Id: I51928279d97c9d86c67db7de5eb7fc9d317fd381 Reviewed-on: https://gerrit.openafs.org/12741 Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit edc5463f3db4b6af2307741d9f4ee8f2c81cd98e Author: Benjamin Kaduk Date: Thu Dec 14 19:54:57 2017 -0600 Fix macro used to check kernel_read() argument order The m4 macro implementing the configure check is called LINUX_KERNEL_READ_OFFSET_IS_LAST, but it defines a preprocessor symbol that is just KERNEL_READ_OFFSET_IS_LAST. Our code needs to check for the latter being defined, not the former. Reported by Aaron Ucko. 
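The name mismatch described above can be illustrated in plain C. This is a minimal userspace sketch, not the OpenAFS source: the two `offset_is_last_*` helpers are hypothetical, and the `#define` merely stands in for what the configure check would emit into the generated config header.

```c
#include <assert.h>

/* Stand-in for the symbol the configure check defines.  The m4 macro is
 * named LINUX_KERNEL_READ_OFFSET_IS_LAST, but that name is never defined
 * for the C preprocessor. */
#define KERNEL_READ_OFFSET_IS_LAST 1

/* Hypothetical helper testing the m4 macro name: the #ifdef is silently
 * false, so the "offset is last" branch is never selected. */
int offset_is_last_wrong(void)
{
#ifdef LINUX_KERNEL_READ_OFFSET_IS_LAST
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper testing the symbol that is actually defined. */
int offset_is_last_right(void)
{
#ifdef KERNEL_READ_OFFSET_IS_LAST
    return 1;
#else
    return 0;
#endif
}
```

Because an undefined name in `#ifdef` is not an error, the wrong spelling compiles cleanly and simply selects the other branch, which is why this kind of mistake tends to be found by inspection or at run time rather than by the compiler.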
Change-Id: Id7cd3245b6a8eb05f83c03faee9c15bab8d0f6e8 Reviewed-on: https://gerrit.openafs.org/12808 Reviewed-by: Anders Kaseorg Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 894555f93a2571146cb9ca07140eb98c7a424b01 Author: Benjamin Kaduk Date: Mon Dec 4 17:20:57 2017 -0600 OPENAFS-SA-2017-001: rx: Sanity-check received MTU and twind values Rather than blindly trusting the values received in the (unauthenticated) ack packet trailer, apply some minimal sanity checks to received values. natMTU and regular MTU values are subject to Rx minimum/maximum packet sizes, and the transmit window cannot drop below one without risk of deadlock. The maxDgramPackets value that can also be present in the trailer already has sufficient sanity checking. Extremely low MTU values (less than 28 == RX_HEADER_SIZE) can cause us to set a negative "maximum usable data" size that gets used as an (unsigned) packet length for subsequent allocation and computation, triggering an assertion when the connection is used to transmit data. FIXES 134450 Change-Id: I37698ff166da47a57aa0d1962ae8effc74e30851 commit 4fa0ee620cfb9991ca9748b5ee116cc8e1e6c505 Author: Benjamin Kaduk Date: Mon Nov 27 22:17:28 2017 -0600 afs: Fix bounds check in PNewCell Reported by the opensuse buildbot: CC [M] /home/buildbot/opensuse-tumbleweed-i386-builder/build/src/libafs/MODLOAD-4.13.12-1-default-MP/rx_packet.o /home/buildbot/opensuse-tumbleweed-i386-builder/build/src/afs/afs_pioctl.c: In function ‘PNewCell’: /home/buildbot/opensuse-tumbleweed-i386-builder/build/src/afs/afs_pioctl.c:3075:55: error: ‘*’ in boolean context, suggest ‘&&’ instead [-Werror=int-in-bool-context] if ((afs_pd_remaining(ain) < AFS_MAXCELLHOSTS +3) * sizeof(afs_int32)) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~ The bug was introduced in commit 718f85a8b6.
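The compiler message points at a misplaced closing parenthesis. The following is a hedged sketch of the effect, not the actual patch: the function names are hypothetical, `MAXCELLHOSTS` is given an assumed value for illustration, and the "fixed" form is only the presumably intended expression.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXCELLHOSTS 8   /* assumed value, for illustration only */

/* Buggy shape: the comparison result (0 or 1) is multiplied by
 * sizeof(int32_t), so the byte count is compared against
 * MAXCELLHOSTS + 3 instead of (MAXCELLHOSTS + 3) words. */
int too_short_buggy(size_t remaining)
{
    return ((remaining < MAXCELLHOSTS + 3) * sizeof(int32_t)) != 0;
}

/* Presumably intended shape: compare the remaining byte count against
 * the size of MAXCELLHOSTS + 3 32-bit words. */
int too_short_fixed(size_t remaining)
{
    return remaining < (MAXCELLHOSTS + 3) * sizeof(int32_t);
}
```

With these assumed values, a 20-byte input is accepted by the buggy check (20 is not less than 11) but rejected by the fixed one (20 is less than 44), so the buggy form lets a short buffer through.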
Change-Id: Iae55a99e35266aa763fb431f2acc4eba09fa5357 Reviewed-on: https://gerrit.openafs.org/12782 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 66b74e78ba5fea6a8236dcd3b8b46e1dfa6a0ac7 Author: Benjamin Kaduk Date: Mon Nov 27 22:07:53 2017 -0600 rx: fix call refcount leak in error case The recent event handling normalization in commit 304d758983b499dc568d6ca57b6e92df24b69de8 had event handlers switch to dropping their reference on the associated connection/call just before return. An early return case was missed in the conversion, leading to a refcount leak in an error case. Change-Id: Ie3d0bc9474fdbc09be9c753f4d0192c8cca68351 Reviewed-on: https://gerrit.openafs.org/12781 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 3ce55426ee6912b78460465bcaa1428333ad1fbc Author: Marcio Barbosa Date: Thu Nov 16 17:24:03 2017 -0500 afs: fix kernel_write / kernel_read arguments The order / content of the arguments passed to kernel_write and kernel_read are not right. As a result, the kernel will panic if one of the functions in question is called. [kaduk@mit.edu: include configure check for multiple kernel_read() variants, per linux commits bdd1d2d3d251c65b74ac4493e08db18971c09240 and e13ec939e96b13e664bb6cee361cc976a0ee621a] FIXES 134440 Change-Id: I4753dee61f1b986bbe6a12b5568d1a8db30c65f8 Reviewed-on: https://gerrit.openafs.org/12769 Tested-by: BuildBot Tested-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk commit 50a3eb7b7ee94bffaadc98429bd404164e89ec7f Author: Michael Meffie Date: Mon Nov 6 17:37:46 2017 -0500 tests: fix out of bounds access in the rx-event test Use the NUMEVENTS symbol which defines the array size instead of an incorrect hard coded number when checking if a second event can be added to be fired at the same time. This fixes a potential out of bounds access of the event test array. Also update the comment which incorrectly mentions the incorrect number of events in the test. 
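A small sketch of the pattern this commit describes, deriving the bound from the array-size symbol instead of a literal. Only the name NUMEVENTS comes from the commit message; its value here, the `event_slot` type, and `has_next_slot()` are assumptions made up for illustration and are not taken from the test source.

```c
#include <assert.h>

#define NUMEVENTS 2000          /* assumed value, for illustration only */

struct event_slot {
    int scheduled;              /* hypothetical per-event state */
};

struct event_slot events[NUMEVENTS];

/* Returns 1 when slot i + 1 exists, i.e. a second event may be added
 * right after slot i.  The bound follows NUMEVENTS, so resizing the
 * array cannot leave a stale hard-coded limit behind. */
int has_next_slot(int i)
{
    return i >= 0 && i + 1 < NUMEVENTS;
}
```

A caller would test `has_next_slot(i)` before touching `events[i + 1]`; the last valid index, `NUMEVENTS - 1`, correctly reports that no further slot is available.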
Change-Id: I4f993b42e53e7e6a42fa31302fd1baa70e9f5041 Reviewed-on: https://gerrit.openafs.org/12762 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 2ae84bf053fe66b73a2c77b5d71305bae2c17587 Author: Benjamin Kaduk Date: Thu Nov 16 04:49:49 2017 -0600 Sprinkle rx_GetConnection() for concision Instead of inlining the body (taking the lock, incrementing the refcount, and dropping the lock), use the convenience function designed for this purpose. Change-Id: I674d389e61e42710ef340e202992748e66c5e763 Reviewed-on: https://gerrit.openafs.org/12772 Reviewed-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 01bcfd3e14f6ee1faa4b8ce5a7932de37d585fd3 Author: Benjamin Kaduk Date: Thu Nov 16 04:48:02 2017 -0600 rx: fix mutex leak in error case Reported by Mark Vitale Change-Id: I3269fbb0f87285bcb9af64f4ad81791177582e6d Reviewed-on: https://gerrit.openafs.org/12771 Reviewed-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit a7a3108e602c83176c5578c9f28b6312f71aba78 Author: Benjamin Kaduk Date: Tue Oct 31 19:49:09 2017 -0500 Add event-related mutex assertions In utility functions that access fields of type struct rxevent *, assert that the appropriate lock is held for the access in question. These assertions are only compiled in when built with -DOPR_DEBUG_LOCKS, which can be enabled by --debug-locks at configure time. Change-Id: I16885a4d37a0f094f0d365c54e8157ed92070c69 Reviewed-on: https://gerrit.openafs.org/12757 Reviewed-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 304d758983b499dc568d6ca57b6e92df24b69de8 Author: Benjamin Kaduk Date: Sat Oct 7 22:42:38 2017 -0500 Standardize rx_event usage Go over all consumers of the rx event framework and normalize its usage according to the following principles: rxevent_Post() is used to create an event, and it returns an event handle (with a reference on the event structure) that can be used to cancel the event before its timeout fires.
(There is also an additional reference on the event held by the global event tree.) In all(*) usage within the tree, that event handle is stored within either an rx_connection or an rx_call. Reads/writes to the member variable that holds the event handle require either the conn_data_lock or call lock, respectively -- that means that in most cases, callers of rxevent_Post() and rxevent_Cancel() will be holding one of those aforementioned locks. The event handlers themselves will need to modify the call/connection object according to the nature of the event, which requires holding those same locks, and also a guarantee that the call/connection is still a live object and has not been deallocated! Whether or not rxevent_Cancel() succeeds in cancelling the event before it fires, whenever passed a non-NULL event structure it will NULL out the supplied pointer and drop a reference on the event structure. This is the correct behavior, since the caller has asked to cancel the event and has no further use for the event handle or its reference on the event structure. The caller of rxevent_Cancel() must check its return value to know whether or not the event was cancelled before its handler was able to run. The interaction window between the call/connection lock and the lock protecting the red/black tree of pending events opens up a somewhat problematic race window. Because the application thread is expected to hold the call/connection lock around rxevent_Cancel() (to protect the write to the field in the call/connection structure that holds an event handle), and rxevent_Cancel() must take the lock protecting the red/black tree of events, this establishes a lock order with the call/connection lock taken before the eventTree lock. This is in conflict with the event handler thread, which must take the eventTree lock first, in order to select an event to run (and thus know what additional lock would need to be taken, by virtue of what handler function is to be run). 
The conflict is easy to resolve in the standard way, by having a local pointer to the event that is obtained while the event is removed from the red/black tree under the eventTree lock, and then the eventTree lock can be dropped and the event run based on the local variable referring to it. The race window occurs when the caller of rxevent_Cancel() holds the call/connection lock, and rxevent_Cancel() obtains the eventTree lock just after the event handler thread drops it in order to run the event. The event handler function begins to execute, and immediately blocks trying to obtain the call/connection lock. Now that rxevent_Cancel() has the eventTree lock it can proceed to search the tree, fail to find the indicated event in the tree, clear out the event pointer from the call/connection data structure, drop its caller's reference to the event structure, and return failure (the event was not cancelled). Only then does the caller of rxevent_Cancel() drop the call/connection lock and allow the event handler to make progress. This race is not necessarily problematic if appropriate care is taken, but in the previous code such was not the case. In particular, it is a common idiom for the firing event to call rxevent_Put() on itself, to release the handle stored in the call/connection that could have been used to cancel the event before it fired. Failing to do so would result in a memory leak of event structures; however, rxevent_Put() does not check for a NULL argument, so a segfault (NULL dereference) was observed in the test suite when the race occurred and the event handler tried to rxevent_Put() the reference that had already been released by the unsuccessful rxevent_Cancel() call. Upon inspection, many (but not all) of the uses in rx.c were susceptible to a similar race condition and crash. 
The test suite also papers over a related issue in that the event handler in the test suite always knows that the data structure containing the event handle will remain live, since it is a global array that is allocated for the entire scope of the test. In rx.c, events are associated with calls and connections that have a finite lifetime, so we need to take care to ensure that the call/connection pointer stored in the event remains valid for the duration of the event's lifecycle. In particular, even an attempt to take the call/connection lock to check whether the corresponding event field is NULL is fraught with risk, as it could crash if the lock (and containing call/connection) has already been destroyed! There are several potential ways to ensure the liveness of the associated call/connection while the event handler runs, most notably to take care in the call/connection destruction path to ensure that all associated events are either successfully cancelled or run to completion before tearing down the call/connection structure, and to give the pending event its own reference on the associated call/connection. Here, we opt for the latter, acknowledging that this may result in the event handler thread doing the full call/connection teardown and delay the firing of subsequent events. This is deemed acceptable, as pending events are for intentionally delayed tasks, and some extra delay is probably acceptable. (The various keepalive events and the challenge event could delay the user experience and/or security properties if significantly delayed, but I do not believe that this change admits completely unbounded delay in the event handler thread, so the practical risk seems minimal.) Accordingly, this commit attempts to ensure that: * Each event holds a formal reference on its associated call/connection. * The appropriate lock is held for all accesses to event pointers in call/connection structures. 
* Each event handler (after taking the appropriate lock) checks whether it raced with rxevent_Cancel() and only drops the call/connection's reference to the event if the race did not occur. * Each event handler drops its reference to the associated call/connection *after* doing any actions that might access/modify the call/connection. * The per-event reference on the associated call/connection is dropped by the thread that removes the event from the red/black tree. That is, by the event handler function if the event runs, or by the caller of rxevent_Cancel() when the cancellation succeeds. * No non-NULL event handles remain in a call/connection being destroyed, which would indicate a refcounting error. (*) There is an additional event used in practice, to reap old connections, but it is effectively a background task that reschedules itself periodically, with no handle to the event retained so as to be able to cancel it. As such, it is unaffected by the concerns raised here. While here, standardize on the rx_GetConnection() function for incrementing the reference count on a connection object, instead of inlining the corresponding mutex lock/unlock and variable access. Also enable refcount checking unconditionally on unix, as this is a rather invasive change late in the 1.8.0 release process and we want to get as much sanity checking coverage as possible.
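The reference-counting rules listed above can be modelled in a few lines of C. This is a deliberately simplified, single-threaded sketch under stated assumptions: all type, field, and function names (`event`, `call`, `resend`, `cancel_after_dequeue`, `handler`, `fire`) are hypothetical, locks are indicated only by comments, and counters are decremented without freeing anything. It is not the rx implementation.

```c
#include <assert.h>
#include <stddef.h>

struct event { int refcount; };
struct call {
    int refcount;
    struct event *resend;       /* hypothetical event handle field */
};

void event_put(struct event *ev) { assert(ev != NULL); ev->refcount--; }
void call_put(struct call *c)    { assert(c != NULL);  c->refcount--; }

/* Models a cancel that lost the race: the event has already left the
 * pending tree.  The handle is cleared and its reference dropped, and 0
 * is returned to say the event was not cancelled.  The per-event
 * reference on the call is NOT dropped here; the handler owns it. */
int cancel_after_dequeue(struct call *c)
{
    if (c->resend != NULL) {
        event_put(c->resend);
        c->resend = NULL;
    }
    return 0;
}

/* Event handler: with the call lock held (not modelled), release the
 * handle only if cancel has not already done so, then drop the event's
 * own reference on the call as the very last step. */
void handler(struct event *ev, struct call *c)
{
    if (c->resend == ev) {      /* no race with cancel */
        event_put(c->resend);
        c->resend = NULL;
    }
    /* ... work that reads or modifies the call goes here ... */
    call_put(c);
}

/* Event thread: run the handler, then drop the pending tree's reference. */
void fire(struct event *ev, struct call *c)
{
    handler(ev, c);
    event_put(ev);
}
```

In both orderings (handler alone, or failed cancel followed by the handler) the event ends with zero references and the call keeps only the application's reference, which is the invariant the bullets above aim for.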
Change-Id: I27bcb932ec200ff20364fb1b83ea811221f9871c Reviewed-on: https://gerrit.openafs.org/12756 Reviewed-by: Mark Vitale Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit bdb509fb1d8e0fdca05dffecdbcbf60a95ea502e Author: Benjamin Kaduk Date: Wed Oct 4 23:03:44 2017 -0500 Adjust rx-event test to exercise cancel/fire race We currently do not properly handle the case where a thread runs rxevent_Cancel() in parallel with the event-handler thread attempting to fire that event, but the test suite only picked up on this issue in a handful of the Debian automated builds (somewhat less-resourced ones, perhaps). Modify the event scheduling algorithm in the test so as to create a larger chunk of events scheduled to fire "right away" and thereby exercise the race condition more often when we proceed to cancel a quarter of events "right away". Change-Id: I50f55fd532901147cfda1a5f40ef949bf3270401 Reviewed-on: https://gerrit.openafs.org/12755 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk commit 311f1d28a2f626350b33ad432e674055b62511bd Author: Michael Laß Date: Thu Nov 2 21:16:49 2017 +0100 gtx: link against libtinfo if termlib is separated If ncurses is built with "./configure --with-termlib=tinfo", gtx fails to link because of an undefined reference to the LINES symbol which is then provided by libtinfo.so and not libncurses.so. If ncurses is present, additionally check whether LINES is provided by ncurses or tinfo and set $LIB_curses accordingly. This change is based on a patch provided by Bastian Beischer.
FIXES 134420 Change-Id: I3e29c61405d90d0b850bafe4c51125bef433452b Reviewed-on: https://gerrit.openafs.org/12760 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit e0c5ada214596d5adb6798682d5e280cc99f447c Author: Benjamin Kaduk Date: Mon Oct 16 16:53:22 2017 -0500 Correct m4 conditionals in curses.m4 AS_IF does not invoke the test(1) shell builtin for us, so we must take care to consistently use it ourself. While here, sprinkle some missing double-quotes around variable expansions in AS_IF statements in this file. Submitted by Bastian Beischer. FIXES 134414 Change-Id: Iccfe311011f17de6317cf64abdc58b0812b81b8c Reviewed-on: https://gerrit.openafs.org/12738 Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk commit 5ee516b3789d3545f3d78fb3aba2480308359945 Author: Damien Diederen Date: Mon Sep 18 12:18:39 2017 +0200 Linux: Use kernel_read/kernel_write when __vfs variants are unavailable We hide the uses of set_fs/get_fs behind a macro, as those functions are likely to soon become unavailable: > Christoph Hellwig suggested removing all calls outside of the core > filesystem and architecture code; Andy Lutomirski went one step > further and said they should all go. https://lwn.net/Articles/722267/ Change-Id: Ib668f8fdb62ca01fe14321c07bd14d218744d909 Reviewed-on: https://gerrit.openafs.org/12729 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit a71288a387095ccb4be83c1abae34ada80f53185 Author: Michael Meffie Date: Fri Jul 21 22:30:43 2017 -0400 redhat: avoid rpmbuild exclude directives Older versions of rpmbuild do not support the files exclude directive, so fall back to the old way in which we remove the files to be excluded and list the files to be included. 
Change-Id: If64df382ef372aa1078f1703a34942a1930bdc88 Reviewed-on: https://gerrit.openafs.org/12733 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 4d247e1ae446c512031511273d556ef1fd32dca1 Author: Michael Meffie Date: Fri Jul 21 22:16:44 2017 -0400 redhat: move .krb variants to the kauth-client subpackage Move the deprecated klog.krb, pagsh.krb, and tokens.krb programs and man pages to the optional openafs-kauth-client subpackage. Change-Id: I09a2e36b60f9d47726a6a314a26db88e44575567 Reviewed-on: https://gerrit.openafs.org/12732 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 671db4ca5a76625d9b7133510cc1cbdda8a5d9b9 Author: Michael Meffie Date: Thu Jul 20 04:13:04 2017 -0400 redhat: specify man pages without wildcards Currently, some of the man pages are specified with the full name and some are specified with a wildcard for the filename extension. Instead, specify all the man pages without wildcards to be more consistent and to avoid putting incorrect man pages in packages. This change removes a stray copy of the klog.krb5.1 man page from the openafs-kauth-client subpackage and moves the AuthLog/AuthLog.dir man pages to the optional openafs-kauth-server subpackage. Change-Id: Id30a6174c532a9a00f850d6ca2722158293d5118 Reviewed-on: https://gerrit.openafs.org/12731 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit a9810b829bdccfed4d1718b11cf4dd51f9565e00 Author: Michael Meffie Date: Fri Jul 21 18:05:48 2017 -0400 redhat: remove afsd.fuse man page The afsd.fuse binary is not currently packaged; do not package the man page. Change-Id: Ia0dd4fa72dc8a87e2c835798b6fbe1213d71da5f Reviewed-on: https://gerrit.openafs.org/12730 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 68ec78950a6e39dc1bf15012d4b889728086d0b7 Author: Marcio Barbosa Date: Mon Aug 21 14:21:54 2017 -0400 ubik: avoid DISK_Begin on sites that didn't vote for sync As already described on 7c708506, SDISK_Begin fails on remotes if lastYesState is not set.
To fix this problem, 7c708506 does not allow write transactions until we know that lastYesState is set on at least a quorum of sites (ubik_syncSiteAdvertised == 1). In other words, if enough sites received a beacon packet informing that a sync-site was elected, write transactions will be allowed. This means that ubik_syncSiteAdvertised can be true while lastYesState is not set in a few sites. Consider the following scenario in a cell with frequent write transactions: Site A => Sync-site (up) Site B => Remote 1 (up) Site C => Remote 2 (down - unreachable) Since A and B are up, we have quorum. After the second wave of beacons, ubik_syncSiteAdvertised will be true and write transactions will be allowed. At some point, C becomes reachable again. Site A sends a copy of its database to C, but C did not vote for A yet (lastYesState == 0). A new write transaction is initialized and, since lastYesState is not set on C, DISK_Begin fails on this remote site and C is marked as down. Since C is reachable, A will mark this remote site as up. The sync-site will send its database to C, but C did not vote for A yet. A new write transaction is initialized and, since lastYesState is not set on C, DISK_Begin fails on this remote site and C is marked as down. In a cell with frequent write transactions, this cycle will repeat forever. As a result, the sync-site will be constantly sending its database to C and quorum will be operating with fewer sites, increasing the chances of re-elections. To fix this problem, do not call DISK_Begin on remotes that did not vote for the sync-site yet.
Change-Id: I27f5122a089064e7b83beba3533261d8a4e31c64 Reviewed-on: https://gerrit.openafs.org/12715 Tested-by: BuildBot Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk commit 929e77a886fc9853ee292ba1aa52a920c454e94b Author: Damien Diederen Date: Mon Sep 18 11:59:40 2017 +0200 Linux: Test for __vfs_write rather than __vfs_read The following commit: commit eb031849d52e61d24ba54e9d27553189ff328174 Author: Christoph Hellwig Date: Fri Sep 1 17:39:23 2017 +0200 fs: unexport __vfs_read/__vfs_write unexports both __vfs_read and __vfs_write, but keeps the former in fs.h--as it is still being used by another part of the tree. This situation results in a false positive in our Autoconf check, which does not see the export statements, and ends up marking the corresponding API as available. That, in turn, causes some code which assumes symmetry with __vfs_write to fail to compile. Switch to testing for __vfs_write, which correctly marks the API as unavailable. Change-Id: I392f2b17b4de7bd81d549c84e6f7b5ef05e1b999 Reviewed-on: https://gerrit.openafs.org/12728 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk commit 0a9a6b57ce6e1c97fcc651c8cb74e66fc8422a1e Author: Anders Kaseorg Date: Fri Sep 1 23:37:07 2017 -0400 vol: Fix two buffers being one char too short Fixes these warnings: namei_ops.c: In function 'namei_copy_on_write': namei_ops.c:1328:31: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=] snprintf(path, sizeof(path), "%s-tmp", name.n_path); ^~~~~~~~ namei_ops.c:1328:2: note: 'snprintf' output between 5 and 260 bytes into a destination of size 259 snprintf(path, sizeof(path), "%s-tmp", name.n_path); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ vol_split.c: In function 'split_volume': vol_split.c:576:22: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=] sprintf(symlink, "#%s", V_name(newvol)); ^~~~~ vol_split.c:576:5: note: 'sprintf' output between 2 and 33 bytes
into a destination of size 32 sprintf(symlink, "#%s", V_name(newvol)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Change-Id: If212ebc29fa3fe10fe1e2f70dfb5f7509c269ae9 Reviewed-on: https://gerrit.openafs.org/12722 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot commit 962f4838dc461567d896304f617a0923745d13d5 Author: Seth Forshee Date: Tue Aug 22 07:59:11 2017 -0500 Linux: Include linux/uaccess.h rather than asm/uaccess.h if present Starting with Linux 4.12 there is a module build error on s390 due to asm/uaccess.h using a macro defined in the common header. The common header has been around since 2.6.18 and has always included asm/uaccess.h, so switch to using the common header whenever it is present. Change-Id: Iaab0d7652483a2a2b1f144f3e90b6d3b902c146d Signed-off-by: Seth Forshee Reviewed-on: https://gerrit.openafs.org/12714 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot
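Editorial sketch for the "vol: Fix two buffers being one char too short" entry above: the warnings quoted there come down to a destination that must hold the longest source string, the fixed prefix or suffix, and the terminating NUL. The helpers below are hypothetical and are not taken from the OpenAFS tree; the buffer sizes (32/33 and 259/260 bytes) are read off the compiler notes in that entry, and the assumption that the fix simply grows each buffer by one byte is inferred from the commit subject only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not OpenAFS code): write "#<volname>" into dst.
 * Returns 1 if the whole string plus its NUL fit, 0 if snprintf had to
 * truncate. snprintf returns the length it wanted to write, excluding
 * the NUL, so the output fits only when that length is < dstlen.
 */
static int
format_symlink(char *dst, size_t dstlen, const char *volname)
{
    int n;

    assert(dst != NULL && dstlen > 0);
    n = snprintf(dst, dstlen, "#%s", volname);
    return n >= 0 && (size_t)n < dstlen;
}

/*
 * Hypothetical helper (not OpenAFS code): write "<name>-tmp" into dst,
 * with the same return convention as format_symlink().
 */
static int
format_tmp_path(char *dst, size_t dstlen, const char *name)
{
    int n;

    assert(dst != NULL && dstlen > 0);
    n = snprintf(dst, dstlen, "%s-tmp", name);
    return n >= 0 && (size_t)n < dstlen;
}
```

With a 31-character volume name, "#" plus the name plus the NUL needs 33 bytes, so a 32-byte buffer is one short; likewise a 255-character path plus "-tmp" plus the NUL needs 260 bytes, not 259.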