Saturday 12 June 2021

hadoop commands cheat sheet

Hadoop commands cheat sheet | HDFS commands cheat sheet

There are many more commands in "$HADOOP_HOME/bin/hadoop fs" than are demonstrated here. Each command can be run as either hadoop fs or hdfs dfs.

hadoop fs -ls <path> list files in the path of the file system

hadoop fs -chmod <arg> <file-or-dir> alters the permissions of a file or directory, where <arg> is the octal mode, e.g. 777

hadoop fs -chown <owner>:<group> <file-or-dir> change the owner and group of a file or directory

hadoop fs -mkdir <path> make a directory on the file system

hadoop fs -put <local-origin> <destination> copy a file from local storage onto the file system

hadoop fs -get <origin> <local-destination> copy a file to the local storage from the file system

hadoop fs -copyFromLocal <local-origin> <destination> similar to the put command but the source is restricted to a local file reference

hadoop fs -copyToLocal <origin> <local-destination> similar to the get command but the destination is restricted to a local file reference

hadoop fs -touchz <path> create an empty file on the file system

hadoop fs -cat <file> copy files to stdout
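The commands above combine into a typical round trip: create a directory, upload a file, list and print it, then copy it back. The following is a minimal sketch, assuming the hdfs client is on the PATH. The helper name hdfs_roundtrip and the paths in the example call are hypothetical.

```shell
# Sketch: upload a local file into a new HDFS directory, inspect it, fetch it back.
# hdfs_roundtrip is a hypothetical helper; it only chains commands listed above.
hdfs_roundtrip() {
    local_file=$1      # existing local file
    remote_dir=$2      # HDFS directory to create
    local_copy=$3      # where to write the downloaded copy
    name=$(basename "$local_file")

    hdfs dfs -mkdir "$remote_dir" &&
    hdfs dfs -put "$local_file" "$remote_dir/" &&
    hdfs dfs -ls "$remote_dir" &&
    hdfs dfs -cat "$remote_dir/$name" &&
    hdfs dfs -get "$remote_dir/$name" "$local_copy"
}

# Example call (hypothetical paths):
# hdfs_roundtrip ./notes.txt /user/myuser/demo ./notes.copy.txt
```

Each step runs only if the previous one succeeded, so a failed -mkdir or -put stops the chain early.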

-------------------------------------------------------------------------

"<path>" means any file or directory name. 

"<path>..." means one or more file or directory names. 

"<file>" means any filename. 

----------------------------------------------------------------


-ls <path>

Lists the contents of the directory specified by path, showing the names, permissions, owner, size and modification date for each entry.

-lsr <path>

Behaves like -ls, but recursively displays entries in all subdirectories of path. Newer Hadoop releases deprecate -lsr in favor of -ls -R.
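When a script only needs the paths from a listing, the path column can be filtered out of the -ls output. This is a sketch that assumes the usual 8-column layout (permissions, replication, owner, group, size, date, time, path) and paths without spaces. The helper name ls_paths is hypothetical.

```shell
# Print only the path column of `hdfs dfs -ls` output read from stdin.
# Lines such as "Found 2 items" are skipped because they do not start
# with a permission string (- for files, d for directories).
ls_paths() {
    awk '$1 ~ /^[-d]/ && NF >= 8 { print $8 }'
}

# Example (hypothetical path):
# hdfs dfs -ls -R /hadoop | ls_paths
```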

-du <path>

Shows disk usage, in bytes, for all the files which match path; filenames are reported with the full HDFS protocol prefix.

-dus <path>

Like -du, but prints a summary of disk usage of all files/directories in the path. Newer Hadoop releases deprecate -dus in favor of -du -s.
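Because -du reports sizes in bytes, a total can be computed from its output. This is a sketch that assumes the first whitespace-separated field of each line is the byte count, which is worth checking against the output of your Hadoop version. The helper name sum_du_bytes is hypothetical.

```shell
# Sum the first column (bytes) of `hdfs dfs -du` output read from stdin.
# Lines whose first field is not a plain number are ignored.
sum_du_bytes() {
    awk '$1 ~ /^[0-9]+$/ { total += $1 } END { printf "%.0f\n", total }'
}

# Example (hypothetical path):
# hdfs dfs -du /hadoop | sum_du_bytes
```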

-mv <src> <dest>

Moves the file or directory indicated by src to dest, within HDFS.

-cp <src> <dest>

Copies the file or directory identified by src to dest, within HDFS.

-rm <path>

Removes the file or empty directory identified by path.

-rmr <path>

Removes the file or directory identified by path. Recursively deletes any child entries (i.e., files or subdirectories of path). Newer Hadoop releases deprecate -rmr in favor of -rm -r.
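A recursive delete is easy to point at the wrong path, so scripts often guard it. The following is a minimal sketch of such a guard, using the -rm -r spelling of the recursive delete. The wrapper name safe_rmr is hypothetical, and the list of refused targets is only an illustration.

```shell
# Refuse obviously dangerous targets before a recursive delete.
# safe_rmr is a hypothetical wrapper around `hdfs dfs -rm -r`.
safe_rmr() {
    case $1 in
        ""|"/")
            echo "refusing to delete '$1'" >&2
            return 1
            ;;
    esac
    hdfs dfs -rm -r "$1"
}

# Example (hypothetical path):
# safe_rmr /user/myuser/old-output
```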

-put <localSrc> <dest>

Copies the file or directory from the local file system identified by localSrc to dest within the DFS.

-copyFromLocal <localSrc> <dest>

Behaves like -put, except that the source is restricted to a local file reference.

-moveFromLocal <localSrc> <dest>

Copies the file or directory from the local file system identified by localSrc to dest within HDFS, and then deletes the local copy on success.

-cat <filename>

Displays the contents of filename on stdout.

-mkdir [-p] <path>

Creates a directory named path in HDFS.

With -p, also creates any parent directories in path that are missing (like mkdir -p in Linux). Older releases created missing parents even without the flag.

-setrep [-R] [-w] rep <path>

Sets the target replication factor for files identified by path to rep. (The actual replication factor will move toward the target over time.) With -w, the command waits until the target is reached.
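The replication factor multiplies the raw storage a file occupies across the cluster. As a rough worked example, ignoring per-block overhead and assuming plain replication rather than erasure coding, the arithmetic looks like this (raw_bytes is a hypothetical helper):

```shell
# Raw bytes consumed across the cluster = logical file size * replication factor.
raw_bytes() {
    size=$1    # logical file size in bytes
    rep=$2     # replication factor
    echo $(( size * rep ))
}

# A 64 MB file (67108864 bytes) with replication 3:
# raw_bytes 67108864 3    # prints 201326592
```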

-touchz <path>

Creates a zero-length file at path, with the current time as its timestamp. Fails if a file already exists at path, unless that file already has size 0.

-test -[ezd] <path>

Returns exit status 0 if path exists (-e), has zero length (-z), or is a directory (-d); returns 1 otherwise.
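In a script, -test is normally used through its exit status, which follows the shell convention of 0 for success, so it can go straight into an if. This sketch creates a directory only when it is missing. The helper name ensure_dir is hypothetical, and the -p flag of -mkdir is assumed to be available (Hadoop 2 and later).

```shell
# Create an HDFS directory only if it does not exist yet.
# Relies on `hdfs dfs -test -d` exiting with status 0 when the path is a directory.
ensure_dir() {
    if hdfs dfs -test -d "$1"; then
        echo "exists: $1"
    else
        hdfs dfs -mkdir -p "$1" && echo "created: $1"
    fi
}

# Example (hypothetical path):
# ensure_dir /user/myuser/input
```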

-stat [format] <path>

Prints information about path. format is a string which accepts file size in bytes (%b), filename (%n), block size (%o), replication (%r), and modification date (%y, %Y).

-chmod [-R] mode,mode,... <path>...

Changes the file permissions associated with one or more objects identified by path.... Performs changes recursively with -R. mode is a 3-digit octal mode, or {augo}+/-{rwxX}. Assumes a (all) if no scope is specified, and does not apply a umask.
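Each digit of a 3-digit octal mode encodes the read (4), write (2) and execute (1) bits for the owner, the group and others, in that order. The following sketch spells that mapping out; octal_to_rwx is a hypothetical helper for illustration and is not part of Hadoop.

```shell
# Convert a 3-digit octal mode such as 755 into its rwx form.
octal_to_rwx() {
    out=""
    # Split the mode into single digits: "755" -> "7 5 5"
    for digit in $(echo "$1" | sed 's/./& /g'); do
        case $digit in
            0) out="${out}---" ;;
            1) out="${out}--x" ;;
            2) out="${out}-w-" ;;
            3) out="${out}-wx" ;;
            4) out="${out}r--" ;;
            5) out="${out}r-x" ;;
            6) out="${out}rw-" ;;
            7) out="${out}rwx" ;;
            *) echo "not an octal digit: $digit" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$out"
}

# octal_to_rwx 755    # prints rwxr-xr-x
# octal_to_rwx 640    # prints rw-r-----
```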

-chown [-R] [owner][:[group]] <path>...

Sets the owning user and/or group for files or directories identified by path.... Sets owner recursively if -R is specified.

-chgrp [-R] group <path>...

Sets the owning group for files or directories identified by path.... Sets group recursively if -R is specified.


LIST FILES

hdfs dfs -ls / ==> Lists all the files/directories at the given HDFS path.

hdfs dfs -ls -d /hadoop ==> Directories are listed as plain files. In this case, the command lists the details of the hadoop folder itself.

hdfs dfs -ls -h /data ==> Formats file sizes in a human-readable fashion (e.g. 64.0m instead of 67108864).

hdfs dfs -ls -R /hadoop ==> Recursively lists all files in the hadoop directory and all of its subdirectories.

hdfs dfs -ls /hadoop/dat* ==> Lists all the files matching the pattern. In this case, it lists all the files inside the hadoop directory whose names start with 'dat'.
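The human-readable sizes in the -ls -h example above come from repeated division by 1024. This sketch reproduces the arithmetic only; the exact formatting printed by Hadoop differs between versions, and human_size is a hypothetical helper.

```shell
# Convert a byte count into a human-readable size using powers of 1024.
human_size() {
    awk -v bytes="$1" 'BEGIN {
        split("K M G T", unit, " ")
        size = bytes
        i = 0
        # Divide by 1024 until the value fits the current unit
        while (size >= 1024 && i < 4) { size /= 1024; i++ }
        if (i == 0) printf "%d\n", size
        else printf "%.1f %s\n", size, unit[i]
    }'
}

# human_size 67108864    # prints 64.0 M
```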

OWNERSHIP AND VALIDATION

hdfs dfs -checksum /hadoop/file1 ==> Dumps checksum information for the files that match the given path or pattern to stdout.

hdfs dfs -chmod 755 /hadoop/file1 ==> Changes the permissions of the file.

hdfs dfs -chmod -R 755 /hadoop ==> Changes the permissions of the files recursively.

hdfs dfs -chown myuser:mygroup /hadoop ==> Changes the owner of the file. In this command, myuser is the owner and mygroup is the group.

hdfs dfs -chown -R hadoop:hadoop /hadoop ==> Changes the owner of the files recursively.

hdfs dfs -chgrp ubuntu /hadoop ==> Changes the group association of the file.

hdfs dfs -chgrp -R ubuntu /hadoop ==> Changes the group association of the files recursively.
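Ownership and permission changes are often applied together when a directory tree is handed to another user. The following is a minimal sketch of that combination. The helper name hand_over is hypothetical, and changing the owner normally requires HDFS superuser rights.

```shell
# Give a directory tree to owner:group and apply one mode to everything below it.
# hand_over is a hypothetical helper built from the -chown and -chmod commands above.
hand_over() {
    owner=$1
    group=$2
    path=$3
    mode=$4
    hdfs dfs -chown -R "$owner:$group" "$path" &&
    hdfs dfs -chmod -R "$mode" "$path"
}

# Example, using the names from the commands above:
# hand_over myuser mygroup /hadoop 755
```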
