After the first Worker registers with the Master, the Master should check the job queue (described later) for any work it can assign to the Worker (because a job could have arrived at the Master before any Workers registered). If the Master is already executing a map/group/reduce phase, it can wait until the next phase, or until the current task completes, before assigning the Worker any tasks.

At this point, you should be able to pass test_master_1 and test_worker_1.

New Job Request [Master]

In the event of a new job, the Master will receive the following message on its main TCP socket:

{
  "message_type": "new_master_job",
  "input_directory": string,
  "output_directory": string,
  "mapper_executable": string,
  "reducer_executable": string,
  "num_mappers": int,
  "num_reducers": int
}

In response to a job request, the Master will create a set of new directories where all of the temporary files for the job will go, of the form tmp/job-{id}, where id is the current job counter (starting at 0, just like all counters). The directory structure will resemble this example (you should create 4 new folders for each job):

tmp/
  job-0/
    mapper-output/
    grouper-output/
    reducer-output/
  job-1/
    mapper-output/
    grouper-output/
    reducer-output/

Remember, each MapReduce job occurs in 3 phases: mapping, grouping, reducing. Workers will do the mapping and reducing independently using the given executable files, but the Master and Workers will have to cooperate during the grouping phase. After the directories are set up, the Master should check whether any Workers are ready to work and whether the MapReduce server is currently executing a job. If the server is busy, or there are no available Workers, the job should be added to an internal queue (described next) and the handler should return.
If there are ready Workers and the server is not busy, then the Master can begin job execution.

At this point, you should be able to pass test_master_2.

Job Queue [Master]

If a Master receives a new job while it is already executing one, or when there were no ready Workers, it should accept the job, create the directories, and store the job in an internal queue until the current one has finished. Note that this means the current job's map, group, and reduce tasks must be complete before the next job's map phase can begin. As soon as a job finishes, the Master should process the next pending job if there is one (and if there are ready Workers) by starting its map stage. For simplicity, in this project, your MapReduce server will only execute one MapReduce job at any time.

As noted earlier, when you see the first Worker register to work, you should check the job queue for pending jobs.

Input Partitioning [Master]

To start off the map stage, the Master should scan the input directory and partition the input files into 'X' parts (where 'X' is the number of map tasks specified in the incoming job). After partitioning the input, the Master needs to let each Worker know what work it is responsible for. Each Worker could get zero, one, or many such tasks. The Master will send a JSON message of the following form to each Worker (on each Worker's specific TCP socket), letting them know that they have work to do:

{
  "message_type": "new_worker_job",
  "input_files": [list of strings],
  "executable": string,
  "output_directory": string,
  "worker_pid": int
}

Consider the case where there are 2 Workers available, 5 input files, and 4 map tasks specified. The Master should create 4 tasks, 3 with one input file each and 1 with 2 input files. It would then attempt to balance these tasks among all the Workers. In this case, it would send 2 map tasks to each Worker.
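The partitioning and balancing described above can be sketched as a simple round-robin split. This is a sketch only; the helper name is illustrative, not part of the spec:

```python
def partition(input_files, num_tasks):
    """Split input files into num_tasks lists, round-robin,
    so list sizes differ by at most one file."""
    tasks = [[] for _ in range(num_tasks)]
    for i, name in enumerate(sorted(input_files)):
        tasks[i % num_tasks].append(name)
    return tasks
```

With 5 input files and 4 map tasks, this yields 3 tasks of one file and 1 task of two files, matching the example above; the same idea can then be reused to spread the resulting tasks across registered Workers.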
The Master does not need to wait for a "finished" message before it assigns more tasks to a Worker; a Worker should be able to handle multiple tasks at the same time.

Mapping [Workers]

When a Worker receives this new job message, its handle_msg will start executing the given executable over the specified input files, directing the output to the given output_directory (one output file per input file; you should run the executable on each input file). The input is passed to the executable through standard input and the output is written to a specific file. The output file names should be the same as the input file names (overwrite the file if it already exists). The output_directory in the map stage will always be the mapper-output folder (i.e. tmp/job-{id}/mapper-output/).

For example, the Master should specify the input file data/input/file_001.txt and the output file tmp/job-0/mapper-output/file_001.txt.

Hint: See the command line package sh listed in the Libraries section. See sh.Command(), and the _in and _out arguments, in order to funnel the input and output easily.

The Worker should be agnostic to map and reduce jobs. Regardless of the type of operation, the Worker is responsible for running the specified executable over the input files one by one, piping to the output directory for each input file. Once a Worker has finished its job, it should send a TCP message to the Master's main socket of the form:

{
  "message_type": "status",
  "output_files": [list of strings],
  "status": "finished",
  "worker_pid": int
}

At this point, you should be able to pass test_worker_3, test_worker_4, and test_worker_5.

Grouping [Master + Workers]

Once all of the mappers have finished, the Master will start the "grouping" phase. This should begin right after the LAST Worker finishes the map stage (i.e. you will get a "finished" message from the last Worker, and the handle_msg handling that message will continue with the grouping stage).

To start the group stage, the Master looks at all of the files created by the mappers and assigns Workers to sort and merge the files. Sorting in the group stage should happen by line, not by key. If there are more files than Workers, the Master should attempt to balance the files evenly among them. If there are fewer files than Workers, it is okay if some Workers sit idle during this stage. Each Worker will be responsible for merging some number of files into one larger file. The Master will then take these files, merge them into one larger file, and then partition that file into the correct number of files for the reducers. The messages sent to the Workers should look like this:

{
  "message_type": "new_sort_job",
  "input_files": [list of strings],
  "output_file": string,
  "worker_pid": int
}

Once the Worker has finished, it should send back a message formatted as follows:

{
  "message_type": "status",
  "output_file": string,
  "status": "finished",
  "worker_pid": int
}

The names of the intermediate files (the merged files each Worker creates and the single large file the Master creates) are up to you. However, once the Master has split the single input file into the files used for reducing, they must be named reducex, where x is the reduce task number. If there are 4 reduce tasks specified, the Master should create reduce1, reduce2, reduce3, reduce4 in the grouper output directory.

Reducing [Workers]

To the Worker, this is the same as the map stage: it doesn't need to know whether it is running a map or reduce task. The Worker just runs the executable it is told to run; the Master is responsible for telling the Worker to run the correct map or reduce executable. The output_directory in the reduce stage will always be the reducer-output folder.
Again, use the same output file name as the input file.

Once a Worker has finished its job, it should send a TCP message to the Master's main socket of the form:

{
  "message_type": "status",
  "output_files": [list of strings],
  "status": "finished",
  "worker_pid": int
}

Wrapping Up [Master]

As soon as the Master has received the last "finished" message for the reduce tasks of a given job, the Master should move the output files from the reducer-output directory to the final output directory specified by the original job creation message (the value of the output_directory key). In the final output directory, the files should be renamed finaloutputx, where x is the final output file number. If there are 4 final output files, the Master should rename them finaloutput1, finaloutput2, finaloutput3, finaloutput4. Create the output directory if it doesn't already exist. Then check the job queue for the next available job, or go back to listening for jobs if there isn't one currently in the job queue.

Shutdown [Master + Worker]

The Master can also receive a special message to initiate server shutdown. The shutdown message will be of the following form and will be received on the main TCP socket:

{"message_type": "shutdown"}

The Master should forward this message to all of the Workers that have registered with it. The Workers, upon receiving the shutdown message, should terminate as soon as possible. If a Worker is already in the middle of executing a task, it is okay for it to complete that task before handling the shutdown message, since both happen inside a single thread.

After forwarding the message to all Workers, the Master should terminate itself. At this point, you should be able to pass test_shutdown.

Fault Tolerance + Heartbeats [Master + Worker]

Workers can die at any time and may not finish jobs that you send them. Your Master must account for this.
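One way the Master might track Worker liveness, assuming the heartbeat scheme in this spec (UDP heartbeats every 2 seconds, declared dead after more than 5 missed pings). The helper names and the dict-of-timestamps layout are illustrative choices, not requirements:

```python
import time

HEARTBEAT_INTERVAL = 2  # seconds between heartbeats (from the spec)
MAX_MISSED = 5          # more than 5 missed pings means the Worker is dead

def update_heartbeat(last_seen, worker_pid, now=None):
    """Record the arrival time of the latest heartbeat from a Worker."""
    last_seen[worker_pid] = time.time() if now is None else now

def dead_workers(last_seen, now=None):
    """Return pids of Workers that have missed more than MAX_MISSED pings."""
    now = time.time() if now is None else now
    return [pid for pid, seen in last_seen.items()
            if now - seen > MAX_MISSED * HEARTBEAT_INTERVAL]
```

A Master loop would call update_heartbeat from its UDP handler and periodically sweep with dead_workers, marking (not deleting) dead entries and reassigning their tasks.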
If a Worker misses more than 5 pings in a row, you should assume that it has died and assign whatever work it was responsible for to another Worker machine.

Each Worker will have a heartbeat thread that sends updates to the Master via UDP. The messages should look like this and should be sent every 2 seconds:

{
  "message_type": "heartbeat",
  "worker_pid": int
}

If a Worker dies before completing all the tasks assigned to it, then all of those tasks (completed or not) should be redistributed to live Workers. At each point of the execution (mapping, grouping, reducing), the Master should attempt to distribute work evenly among available Workers. If a Worker dies while it is executing a task, the Master will have to assign that task to another Worker. You should mark the failed Worker as dead, but do not remove it from the Master's internal data structures. This is due to a constraint of the Python dictionary data structure: modifying keys while iterating over the dictionary can result in an error. For more info on this, please refer to this link.

Your Master should attempt to maximize concurrency but avoid duplication; that is, don't send the same task to different Workers until you know that the Worker previously assigned that task has died.

At this point, you should be able to pass test_master_3, test_master_4, test_worker_2, test_integration_1, test_integration_2, and test_integration_3.

Walk-through example

See a complete example here.

Testing

To aid in writing test cases, we have included an IntegrationManager class which is similar to the manager the autograder will use to test your submissions. You can find this in the starter file tests/integration_manager.py.

In addition, we have provided a simple word count map and reduce example.
You can use these executables, as well as the sample data provided, and compare your server's output with the result obtained by running:

$ cat tests/input/* | ./tests/exec/wc_map.sh | sort | ./tests/exec/wc_reduce.sh > correct.txt

This will generate a file called correct.txt with the final answers, and they must match your server's output, as follows:

$ cat tmp/job-{id}/reducer-output/* | sort > output.txt
$ diff output.txt correct.txt

Note that these executables can be in any language; your server should not limit us to running map and reduce jobs written in python3! To help you test this, we have also provided you with a word count solution written in bash (see section below).

Note that the autograder will swap out your Master for our Master in order to test the Worker (and vice versa). Your code should have no other dependency besides the communication spec, and the messages sent in your system must match those listed in this spec exactly.

Run the public unit tests:

$ pwd
/Users/awdeorio/src/eecs485/p4-mapreduce
$ pytest -sv

Note that the -s flag has been added to the pytest command in order to also show any messages printed to stdout (such as logging messages), to help with debugging.

Test for busy waiting

A solution that busy-waits may pass on your development machine and fail on the autograder due to a timeout. Your laptop is probably much more powerful than the restricted autograder environment, so you might not notice the performance problem locally. See the Processes, Threads and Sockets in Python Tutorial for an explanation of busy-waiting.

To detect busy waiting, time a Master without any Workers. After a few seconds, kill it by pressing Control-C several times. Ignore any errors or exceptions. We can tell that this solution busy-waits because the user time is similar to the real time:

$ time mapreduce-master 6000
INFO:root:Starting master:6000
real 0m4.475s
user 0m4.429s
sys  0m0.039s

This example does not busy wait.
Notice that the user time is small compared to the real time:

$ time mapreduce-master 6000
INFO:root:Starting master:6000
real 0m3.530s
user 0m0.275s
sys  0m0.036s

Testing Fault-Tolerance

Testing for fault tolerance is a major and tricky part of this project. This section provides some basic guidelines on how you can verify that your system handles fault tolerance in the desired manner.

To create the condition of a dead Worker, it is important for a Worker to die while performing a task (in other words, for the Master to realize that a Worker has missed more than 5 consecutive heartbeat messages while a still-incomplete task was assigned to that Worker, after which it should declare the Worker dead and reassign the task). We have given you slow-running map and reduce executables in tests/exec/wc_map_slow.sh and tests/exec/wc_reduce_slow.sh. These scripts make use of sleep statements. You can choose a sleep time that you feel gives you enough time to kill a Worker while it is executing a task.

The idea is to start your server and send a slow job to the Master. Once the task has been assigned to a Worker, since there are sleep statements in the map/reduce scripts, you should have enough time to manually kill a Worker and then see if the Master can still make forward progress (handle the dead Worker and still produce the correct output).

For example, imagine a scenario where there are 2 Workers, each executing one slow map task. Now the second Worker dies during this execution (because you manually killed the process, which the sleep statements in the map code give you time to do). In this scenario, how many map tasks should the first Worker receive? How many map tasks should the second Worker have received? How many sorting and reducing tasks should the first and second Workers receive?
If your code gives the expected answers to these questions, then you are in good shape.

Code Style

As in previous projects, all Python code should contain no errors or warnings from pycodestyle, pydocstyle, and pylint.

You may not use any external dependencies aside from what is provided in setup.py.

Test Case Descriptions

Many of the autograder test cases in this project are visible on the autograder, but the source code is not published. We can't publish the source code because many unit tests combine instructor code (e.g., master) with your code (e.g., worker). This section provides a description of each test case lacking published source code.

test_master_1:
- Starts student master and one instructor worker
- Verifies master received worker registration
- Verifies master can send worker registration acknowledgement

test_master_2:
- Starts student master and one instructor worker
- Submits a word count job to the master:
  input_directory: tests/input/
  mapper_executable: tests/exec/wc_map.sh
  reducer_executable: tests/exec/wc_reduce.sh
  num_mappers: 2
  num_reducers: 1
- Verifies master created the correct job directory structure

test_master_3:
- Starts student master and one instructor worker
- Submits a word count job to the master:
  input_directory: tests/input/
  mapper_executable: tests/exec/wc_map.sh
  reducer_executable: tests/exec/wc_reduce.sh
  num_mappers: 2
  num_reducers: 1
- Verifies master created the correct job directory structure
- Verifies master sent sort job to worker
- Verifies output is correct

test_master_4:
- Starts student master and one instructor worker
- Submits a word count job to the master:
  input_directory: tests/input/
  mapper_executable: tests/exec/wc_map.sh
  reducer_executable: tests/exec/wc_reduce.sh
  num_mappers: 2
  num_reducers: 1
- Verifies master created the correct job directory structure
- Verifies master sent a map job to the worker
- Verifies master sent a sort job to the worker
- Verifies master sent a reduce job to the worker
- Verifies final output is correct

test_worker_1:
- Starts instructor master and one student worker
- Verifies student worker process is running
- Verifies instructor master received worker register message after 2 seconds

test_worker_2:
- Starts instructor master and one student worker
- Verifies student worker process is running
- Verifies instructor master received register message after 2 seconds
- Verifies instructor master received heartbeat messages from worker

test_worker_3:
- Starts instructor master and one student worker
- Verifies student worker process is running
- Verifies instructor master received register message after 2 seconds
- Submits a word count map job to worker:
  executable: tests/exec/wc_map.sh
  input_files: input/file01
  output_directory: tmp/test_worker_3/output/
- Verifies instructor master received "finished" message from worker

test_worker_4:
- Starts instructor master and one student worker
- Verifies student worker process is running
- Verifies instructor master received register message after 2 seconds
- Submits a word count map job to worker:
  executable: tests/exec/wc_map.sh
  input_files: input/file02
  output_directory: tmp/test_worker_4/output/
- Verifies instructor master received "finished" message from worker
- Diff checks worker-generated output file for correctness

test_worker_5:
- Starts instructor master and one student worker
- Verifies student worker process is running
- Verifies instructor master received register message after 2 seconds
- Submits a word count map job to worker:
  executable: tests/exec/wc_map.sh
  input_files: input/file01, input/file02
  output_directory: tmp/test_worker_5/output/
- Verifies instructor master received "finished" message from worker
- Diff checks worker-generated output file for correctness

Submitting and grading

One team member should register your group on the autograder using the create new invitation feature.

Submit a tarball to the autograder, which is linked from https://eecs485.org.
Include the --disable-copyfile flag only on macOS.

$ tar --disable-copyfile --exclude '*__pycache__*' --exclude '*tmp*' -czvf submit.tar.gz setup.py bin mapreduce

Rubric

This is an approximate rubric.
