[root@172 ~]# su hadoop
[hadoop@172 root]$ cd /usr/local/service/hadoop
[hadoop@172 hadoop]$
Hello World.
this is a message.
this is another message.
Hello world, how are you?
scp $localfile root@$public_ip:$remotefolder
In this example the file is uploaded to the /usr/local/service/hadoop path.
[hadoop@172 hadoop]$ ls -l
[hadoop@172 hadoop]$ hadoop fs -put /usr/local/service/hadoop/test.txt /user/hadoop/
[hadoop@172 hadoop]$ hadoop fs -ls /user/hadoop
Output:
-rw-r--r--   3 hadoop supergroup         85 2018-07-06 11:18 /user/hadoop/test.txt
If the /user/hadoop folder does not exist yet, you can create it yourself with the following command:
[hadoop@172 hadoop]$ hadoop fs -mkdir /user/hadoop
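The same HDFS preparation can also be done programmatically with the Hadoop FileSystem API that the Maven dependencies below pull in. The following is only a minimal sketch, not part of the original guide: the class name HdfsPrepare is hypothetical, the hard-coded paths mirror the commands above, and it assumes it runs on a cluster node whose default configuration points at the cluster's HDFS.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper: creates /user/hadoop and uploads test.txt, mirroring
// the hadoop fs -mkdir / -put / -ls commands above.
public class HdfsPrepare {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml on the node
        FileSystem fs = FileSystem.get(conf);       // the cluster's default file system (HDFS)

        Path dir = new Path("/user/hadoop");
        if (!fs.exists(dir)) {
            fs.mkdirs(dir);                         // hadoop fs -mkdir /user/hadoop
        }
        fs.copyFromLocalFile(new Path("/usr/local/service/hadoop/test.txt"),
                             new Path("/user/hadoop/test.txt"));   // hadoop fs -put ...

        for (FileStatus status : fs.listStatus(dir)) {              // hadoop fs -ls /user/hadoop
            System.out.println(status.getPath() + "\t" + status.getLen());
        }
        fs.close();
    }
}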
[hadoop@10 hadoop]$ hadoop fs -ls cosn://$bucketname/test.txt
-rw-rw-rw-   1 hadoop hadoop       1366 2017-03-15 19:09 cosn://$bucketname/test.txt
[hadoop@10 hadoop]$ hadoop fs -put test.txt cosn://$bucketname/
[hadoop@10 hadoop]$ hadoop fs -ls cosn://$bucketname/test.txt
-rw-rw-rw-   1 hadoop hadoop       1366 2017-03-15 19:09 cosn://$bucketname/test.txt
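Because the EMR cluster's configuration already maps the cosn:// scheme to the COS file system, the same generic FileSystem API can address the bucket directly. The sketch below is illustrative only: it assumes that cosn:// configuration is already in place on the node, and the class name CosList and the bucket argument are hypothetical.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper: lists a COS path through the cosn:// scheme,
// relying on the cluster's existing COS configuration.
public class CosList {
    public static void main(String[] args) throws Exception {
        String bucket = args[0];                    // your bucket name, passed on the command line
        Configuration conf = new Configuration();   // picks up the cluster's core-site.xml
        FileSystem fs = FileSystem.get(URI.create("cosn://" + bucket + "/"), conf);

        for (FileStatus status : fs.listStatus(new Path("cosn://" + bucket + "/"))) {
            System.out.println(status.getPath() + "\t" + status.getLen());
        }
        fs.close();
    }
}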
In the D://mavenWorkplace directory, enter the following command to create a Maven project:
mvn archetype:generate -DgroupId=$yourgroupID -DartifactId=$yourartifactID -DarchetypeArtifactId=maven-archetype-quickstart
Here, $yourgroupID is your package name, $yourartifactID is your project name, and maven-archetype-quickstart indicates that a Maven Java project is to be created. Some files need to be downloaded while the project is being created, so keep the network connected. After the project is created, a project folder named $yourartifactID is generated in the D://mavenWorkplace directory, with the following file structure:
simple
---pom.xml          Core configuration, under the project root
---src
   ---main
      ---java       Java source code directory
      ---resources  Java configuration file directory
   ---test
      ---java       Test source code directory
      ---resources  Test configuration directory
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.7.3</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>2.7.3</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
                <encoding>utf-8</encoding>
            </configuration>
        </plugin>
        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

import java.io.IOException;
import java.util.StringTokenizer;

/**
 * Created by tencent on 2018/7/6.
 */
public class WordCount {

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private static final IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Mapper<Object, Text, Text, IntWritable>.Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                this.word.set(itr.nextToken());
                context.write(this.word, one);
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Reducer<Text, IntWritable, Text, IntWritable>.Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            this.result.set(sum);
            context.write(key, this.result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length < 2) {
            System.err.println("Usage: wordcount <in> [<in>...] <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        for (int i = 0; i < otherArgs.length - 1; i++) {
            FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
        }
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[(otherArgs.length - 1)]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
mvn package
scp $jarpackage root@$public_ip:/usr/local/service/hadoop
Here $jarpackage is the path and name of your local JAR package, root is the CVM user name, and the public IP address can be viewed in the node information of the EMR console or in the CVM console. In this example the package is uploaded to the /usr/local/service/hadoop folder of the EMR cluster.
Enter the /usr/local/service/hadoop directory, as in the data preparation step, and submit the task with the following command:
[hadoop@10 hadoop]$ bin/hadoop jar /usr/local/service/hadoop/WordCount-1.0-SNAPSHOT-jar-with-dependencies.jar WordCount /user/hadoop/test.txt /user/hadoop/WordCount_output
Here /user/hadoop/test.txt is the input file to be processed and /user/hadoop/WordCount_output is the output folder. Make sure that the WordCount_output folder does not already exist before you submit the command; otherwise the submission will fail.
After the task completes, view the output folder:
[hadoop@172 hadoop]$ hadoop fs -ls /user/hadoop/WordCount_output
Found 2 items
-rw-r--r--   3 hadoop supergroup          0 2018-07-06 11:35 /user/hadoop/WordCount_output/_SUCCESS
-rw-r--r--   3 hadoop supergroup         82 2018-07-06 11:35 /user/hadoop/WordCount_output/part-r-00000
[hadoop@172 hadoop]$ hadoop fs -cat /user/hadoop/WordCount_output/part-r-00000
Hello	2
World.	1
a	1
another	1
are	1
how	1
is	2
message.	2
this	2
world,	1
you?	1
……
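If you prefer to read the results back in code rather than with hadoop fs -cat, the FileSystem API can stream the part file. This is a minimal sketch under the same assumptions as above (run on a cluster node, default HDFS configuration); the class name ReadOutput is hypothetical.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper: prints the WordCount result file line by line,
// equivalent to hadoop fs -cat /user/hadoop/WordCount_output/part-r-00000.
public class ReadOutput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path part = new Path("/user/hadoop/WordCount_output/part-r-00000");
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(part), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);   // each line is "word<TAB>count"
            }
        }
        fs.close();
    }
}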
Enter the /usr/local/service/hadoop directory and submit the task with the following command:
[hadoop@10 hadoop]$ hadoop jar /usr/local/service/hadoop/WordCount-1.0-SNAPSHOT-jar-with-dependencies.jar WordCount cosn://$bucketname/test.txt cosn://$bucketname/WordCount_output
The input file this time is cosn://$bucketname/test.txt, where $bucketname is your bucket name plus the path. The results are also written to COS. View the output files with the following command:
[hadoop@10 hadoop]$ hadoop fs -ls cosn://$bucketname/WordCount_output
Found 2 items
-rw-rw-rw-   1 hadoop Hadoop          0 2018-07-06 10:34 cosn://$bucketname/WordCount_output/_SUCCESS
-rw-rw-rw-   1 hadoop Hadoop       1306 2018-07-06 10:34 cosn://$bucketname/WordCount_output/part-r-00000
[hadoop@10 hadoop]$ hadoop fs -cat cosn://$bucketname/WordCount_output/part-r-00000
Hello	2
World.	1
a	1
another	1
are	1
how	1
is	2
message.	2
this	2
world,	1
you?	1